Open Data Cube

The document provides a high-level overview of the Open Data Cube (ODC) architecture and ecosystem. It describes the ODC core as a common analytical framework that facilitates the organization and analysis of large geospatial datasets. It also outlines the major components of the ODC including downloading/indexing data locally, ingesting the data, and loading the data for analysis. Finally, it briefly describes some popular open-source tools that interface with the ODC core to enable analysis of satellite imagery and other geospatial data.

Uploaded by

Ajan Senshil

Architecture and Ecosystem

A High-Level Overview

Summary
The Open Data Cube (ODC) is an open source solution for accessing, managing, and analyzing large quantities of Geographic
Information System (GIS) data, namely Earth observation (EO) data. It presents a common analytical framework composed
of a series of data structures and tools that facilitate the organization and analysis of large gridded data collections. The
Open Data Cube was developed for the analysis of temporally rich Earth observation data; however, the flexibility of the
platform also allows other gridded data collections to be included and analyzed. Such data may include elevation models,
geophysical grids, interpolated surfaces, and model outputs. A key characteristic of the Open Data Cube is that every unique
observation is kept, which contrasts with many other methods used to handle large gridded data collections. Some of the
major advantages of the ODC are the following:

• Flexible framework
• Users maintain control and ownership over their data
• Paradigm shift from scene-based analysis to pixel-based analysis
• Lower barrier to entry for remote sensing data analysis

In this document, we briefly describe and illustrate the high-level architecture and ecosystem of the ODC framework in order
to provide a better understanding to those who are new to ODC. This document only covers major components of the ODC
and the relationships between them.

CONTENTS

Summary

High-Level ODC Overview

High-Level ODC Ecosystem

Download Data Locally and Index

Download Data Locally and Ingest

Load Data

Acronyms

High-Level ODC Overview

The ODC core [1] serves as a layer between satellite data and end user applications. It provides a common analytical framework
that allows multiple data sources to produce information for multiple uses. The ODC can handle data from any satellite data
provider. It removes the need for individual applications to perform difficult and time-consuming pre-processing of the
data. This increases the capacity of the Earth Observation (EO) community to develop information products, and increases
the value of EO information for the public. Figure 1 illustrates data from many satellite data
providers being managed by an ODC system.

Figure 1: High-Level Open Data Cube Overview

Several international space agencies provide data and make provisions to supply this data in an Analysis Ready Data (ARD)
format for immediate application. Figure 1 illustrates a diverse set of data being managed by an ODC core system. The ODC
core system then serves as a simplified basis on which end users conduct analysis using ODC-compatible analysis tools.

[1] https://github.com/opendatacube/datacube-core

High-Level ODC Ecosystem

As stated earlier, the ODC core serves as a layer between satellite data providers and applications. A set of open source tools
exists to help scientists conduct research using data managed by the ODC. Figure 2 illustrates popular tools used within the
community that use the ODC core as their basis.

• Command Line Tools: Tools used by programmers and developers to interface with the ODC.
• Open Data Cube Explorer: A visual and interactive web application that lets users explore their inventory of available
data.
• Open Data Cube Stats: An optimized means of defining and executing advanced analysis on an ODC system. This tool
is oriented towards scientists.
• Web User Interface (UI): A web application that allows developers to interactively showcase and visualize the output
of algorithms.
• Jupyter Notebooks: Research documents centered around techniques in EO sciences. A notebook contains
executable code detailing examples of how the data cube is used in a research setting, and therefore is an invaluable
reference material for new users.
• Open Geospatial Consortium (OGC) Web Services: Adapters that can connect non-ODC applications to the ODC.

Figure 2: High-Level ODC Ecosystem


Download Data Locally and Index

In the previous section, Figure 2 showed that the ODC framework can make data accessible to a sizeable ecosystem of
applications. The following section briefly covers a process called indexing. Described plainly, indexing is about making the
ODC aware of the existence of imagery. In the process of indexing data, the ODC tracks information that is useful when it
comes to loading imagery, searching for imagery, or performing more advanced operations such as realigning imagery. The
data can be either downloaded locally or stored in the cloud. In this section, we describe the process of indexing where data
is downloaded locally.

Here is a brief outline of the indexing process:

1. As shown in Figure 3, the first step in this process is to describe the source of the imagery. We include basic details
about which sensor the data comes from, what format to expect the data in, as well as its measurements, e.g. bands.
This is done by drafting a document called a product definition for each data type. This product definition is then
added to the system. Adding a product definition enables the system to accept that product.
2. The second step in the process is about extracting details from an individual satellite image. This is called the data
preparation step. Scripts are available to extract information or metadata from many types of images.
3. The data extracted in step 2 typically includes date and time of acquisition, spatial bounds, etc. as metadata. In the
third step, called indexing, metadata (documents) are indexed into the ODC’s database. Most importantly, the
process stores the location of the data within a local system.
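
The three steps above can be sketched in plain Python. This is a minimal illustration only, not ODC code: the field names loosely follow the general shape of ODC product definitions and dataset documents, while the product name `example_sensor_sr`, the measurement list, the file path, and the in-memory `index` dictionary (standing in for the ODC database) are all assumptions made for the example.

```python
import uuid

# Step 1: a product definition describing the source of the imagery.
# All values are hypothetical; real definitions are usually written as YAML.
product_definition = {
    "name": "example_sensor_sr",
    "description": "Hypothetical surface reflectance product",
    "metadata_type": "eo3",
    "measurements": [
        {"name": "red", "dtype": "int16", "nodata": -9999, "units": "1"},
        {"name": "nir", "dtype": "int16", "nodata": -9999, "units": "1"},
    ],
}


def prepare_dataset(path, acquired, bounds, product):
    """Step 2: build a metadata document for one image (data preparation)."""
    return {
        "id": str(uuid.uuid4()),
        "product": product["name"],
        "datetime": acquired,
        "bounds": bounds,  # (min_x, min_y, max_x, max_y)
        "location": path,  # where the pixels live on the local system
        "measurements": [m["name"] for m in product["measurements"]],
    }


def index_dataset(index, products, doc):
    """Step 3: record the metadata document; the imagery itself is not copied."""
    if doc["product"] not in products:
        raise ValueError(f"unknown product: {doc['product']}")
    index[doc["id"]] = doc
    return doc["id"]


products = {product_definition["name"]: product_definition}
index = {}  # stand-in for the ODC database

doc = prepare_dataset(
    path="/data/example/scene_001.tif",
    acquired="2020-01-15T10:30:00Z",
    bounds=(148.0, -36.0, 149.0, -35.0),
    product=product_definition,
)
dataset_id = index_dataset(index, products, doc)
print(index[dataset_id]["location"])
```

In an actual deployment these steps are typically carried out with the `datacube` command line tool (for example `datacube product add` and `datacube dataset add`) rather than hand-written code; the ODC documentation describes the exact usage.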

Figure 3: Download Data Locally and Index


Download Data Locally and Ingest

In the previous section, we briefly touched on the process of indexing. In this section, we discuss the process called
ingestion. Ingestion takes indexed data and performs operations that turn it into a new file format or
structure. This optimization step can increase the efficiency of data storage and retrieval. For example, converting
downloaded GeoTIFFs to a format like NetCDF can yield significant improvements. Ingestion also splits large scene
files into several smaller tiles to help organize large multidimensional datasets for both fast and flexible data access.
The process is controlled by an ingestion configuration, which specifies how the data is reformatted and which
geospatial transformations are applied to it. Figure 4 illustrates the ingestion process.
The indexed imagery is transformed and reformatted per the ingestion configuration, which may involve resampling,
reprojection, repackaging, recompression, etc. The newly formatted and transformed data is then indexed under
a new product name, which is registered in the database.
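
The tiling part of ingestion can be illustrated with a short sketch. It is not ODC code: the function `tile_bounds`, the grid anchored at the origin, and the 100 km tile size are assumptions chosen for the example, whereas a real ingestion configuration defines its own tiling and coordinate system.

```python
import math


def tile_bounds(scene_bounds, tile_size):
    """Return the bounds of every grid tile that overlaps a scene.

    scene_bounds: (min_x, min_y, max_x, max_y) in the target coordinate system.
    tile_size: edge length of a square tile, in the same units.
    Tiles are aligned to a grid anchored at the origin (0, 0).
    """
    min_x, min_y, max_x, max_y = scene_bounds
    first_col = math.floor(min_x / tile_size)
    last_col = math.ceil(max_x / tile_size)
    first_row = math.floor(min_y / tile_size)
    last_row = math.ceil(max_y / tile_size)
    tiles = {}
    for col in range(first_col, last_col):
        for row in range(first_row, last_row):
            tiles[(col, row)] = (
                col * tile_size,
                row * tile_size,
                (col + 1) * tile_size,
                (row + 1) * tile_size,
            )
    return tiles


# A hypothetical 250 km x 150 km scene cut into 100 km tiles.
scene = (120_000, 40_000, 370_000, 190_000)
tiles = tile_bounds(scene, tile_size=100_000)
print(len(tiles))
```

Because every scene is cut against the same fixed grid, tiles from different acquisitions line up with each other, which is what makes per-pixel access across time straightforward.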

Figure 4: Download Data Locally and Ingest


Load Data
User-supplied query parameters are used as a lookup into the metadata database in order to determine which datasets hold
data requested by the user. Those datasets are then grouped and ordered, and the actual data is loaded from the file system.
The resulting data is organized into an Xarray Dataset with appropriate temporal-spatial dimensions and separate data
variables for each band.
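
The lookup, grouping, and ordering described above can be sketched with a toy metadata index. This is a simplified illustration under stated assumptions: the `INDEX` records, the product names, the file locations, and the helper functions are all hypothetical, and the final step of reading pixels into an Xarray Dataset is left out.

```python
from collections import defaultdict

# Toy metadata records standing in for rows in the ODC index (all values hypothetical).
INDEX = [
    {"product": "example_sensor_sr", "time": "2020-01-15",
     "bounds": (148.0, -36.0, 149.0, -35.0), "location": "/data/a.nc"},
    {"product": "example_sensor_sr", "time": "2020-01-15",
     "bounds": (149.0, -36.0, 150.0, -35.0), "location": "/data/b.nc"},
    {"product": "example_sensor_sr", "time": "2020-01-07",
     "bounds": (148.0, -36.0, 149.0, -35.0), "location": "/data/c.nc"},
    {"product": "other_product", "time": "2020-01-15",
     "bounds": (148.0, -36.0, 149.0, -35.0), "location": "/data/d.nc"},
]


def overlaps(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def find_datasets(index, product, bounds, time_range):
    """Look up the datasets that match the query parameters."""
    start, end = time_range
    return [
        d for d in index
        if d["product"] == product
        and start <= d["time"] <= end
        and overlaps(d["bounds"], bounds)
    ]


def group_by_time(datasets):
    """Group matching datasets by acquisition time, ordered chronologically."""
    groups = defaultdict(list)
    for d in datasets:
        groups[d["time"]].append(d["location"])
    return sorted(groups.items())


query_bounds = (148.5, -35.8, 149.5, -35.2)
matches = find_datasets(
    INDEX, "example_sensor_sr", query_bounds, ("2020-01-01", "2020-01-31")
)
for time, locations in group_by_time(matches):
    print(time, locations)
```

With the actual library, the equivalent request is usually a single call to `Datacube.load` with a product name, spatial extents, and a time range; that call performs the lookup and returns the Xarray Dataset described above.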

Figure 5: Load Data


Acronyms
ARD Analysis Ready Data

EO Earth Observation

DB Database

GA Geoscience Australia

GDAL Geospatial Data Abstraction Library

GIS Geographic Information System

NetCDF Network Common Data Form

ODC Open Data Cube

OGC Open Geospatial Consortium

USGS United States Geological Survey
