The NetCDF Users’ Guide
Data Model, Programming Interfaces, and Format for Self-Describing, Portable Data
NetCDF Version 4.0.1
March 2009
Russ Rew, Glenn Davis, Steve Emmerson, Harvey Davies, and Ed Hartne
Unidata Program Center
Copyright © 2005-2006 University Corporation for Atmospheric Research
Permission is granted to make and distribute verbatim copies of this manual provided that
the copyright notice and these paragraphs are preserved on all copies. The software and any
accompanying written materials are provided “as is” without warranty of any kind. UCAR
expressly disclaims all warranties of any kind, either expressed or implied, including but not
limited to the implied warranties of merchantability and fitness for a particular purpose.
The Unidata Program Center is managed by the University Corporation for Atmospheric
Research and sponsored by the National Science Foundation. Any opinions, findings,
conclusions, or recommendations expressed in this publication are those of the author(s) and
do not necessarily reflect the views of the National Science Foundation.
Mention of any commercial company or product in this document does not constitute an
endorsement by the Unidata Program Center. Unidata does not authorize any use of
information from this publication for advertising or publicity purposes.
Table of Contents
Foreword
Summary
1 Introduction
1.1 The NetCDF Interface
1.2 NetCDF Is Not a Database Management System
1.3 The netCDF File Format
1.4 How to Select the Format
1.4.1 NetCDF Classic Format
1.4.2 NetCDF 64-bit Offset Format
1.4.3 NetCDF-4 Format
1.5 What about Performance?
1.6 Is NetCDF a Good Archive Format?
1.7 Creating Self-Describing Data conforming to Conventions
1.8 Background and Evolution of the NetCDF Interface
1.9 What’s New Since the Previous Release?
1.10 Limitations of NetCDF
1.11 Plans for NetCDF
1.12 References
3 Data
3.1 NetCDF External Data Types
3.2 Data Structures in Classic and 64-bit Offset Files
3.3 NetCDF-4 User Defined Data Types
3.3.1 Compound Types
3.3.2 VLEN Types
3.3.3 Opaque Types
3.3.4 Enum Types
3.3.5 Groups
3.4 Data Access
5 NetCDF Utilities
5.1 CDL Syntax
5.2 CDL Data Types
5.3 CDL Notation for Data Constants
5.4 ncgen
5.5 ncdump
5.6 ncgen4
Appendix A Units
Index
Foreword
Unidata (http://www.unidata.ucar.edu) is a National Science Foundation-sponsored program
empowering U.S. universities, through innovative applications of computers and networks,
to make the best use of atmospheric and related data for enhancing education and
research. For analyzing and displaying such data, the Unidata Program Center offers
universities several supported software packages developed by other organizations. Underlying
these is a Unidata-developed system for acquiring and managing data in real time, making
practical the Unidata principle that each university should acquire and manage its own data
holdings as local requirements dictate. It is significant that the Unidata program has no
data center: the management of data is a "distributed" function.
The Network Common Data Form (netCDF) software described in this guide was originally
intended to provide a common data access method for the various Unidata applications.
These deal with a variety of data types that encompass single-point observations,
time series, regularly-spaced grids, and satellite or radar images.
The netCDF software functions as an I/O library, callable from C, FORTRAN, C++,
Perl, or any other language for which a netCDF library is available. The library stores and
retrieves data in self-describing, machine-independent datasets. Each netCDF dataset can
contain multidimensional, named variables (with differing types that include integers, reals,
characters, bytes, etc.), and each variable may be accompanied by ancillary data, such as
units of measure or descriptive text. The interface includes a method for appending data
to existing netCDF datasets in prescribed ways, functionality that is not unlike a (fixed
length) record structure. However, the netCDF library also allows direct-access storage and
retrieval of data by variable name and index and therefore is useful only for disk-resident
(or memory-resident) datasets.
NetCDF access has been implemented in about half of Unidata’s software, so far, and it
is planned that such commonality will extend across all Unidata applications in order to:
• Facilitate the use of common datasets by distinct applications.
• Permit datasets to be transported between or shared by dissimilar computers
transparently, i.e., without translation.
• Reduce the programming effort usually spent interpreting formats.
• Reduce errors arising from misinterpreting data and ancillary data.
• Facilitate using output from one application as input to another.
• Establish an interface standard which simplifies the inclusion of new software into the
Unidata system.
A measure of success has been achieved. NetCDF is now in use on computing platforms
that range from personal computers to supercomputers and include most UNIX-based
workstations. It can be used to create a complex dataset on one computer (say in FORTRAN)
and retrieve that same self-describing dataset on another computer (say in C) without
intermediate translations. NetCDF datasets can be transferred across a network, or they can
be accessed remotely using a suitable network file system or remote access protocols.
Because we believe that the use of netCDF access in non-Unidata software will benefit
Unidata’s primary constituency (such use may result in more options for analyzing
and displaying Unidata information), the netCDF library is distributed without licensing or
other significant restrictions, and current versions can be obtained via anonymous FTP.
Apparently the software has been well received by a wide range of institutions beyond the
atmospheric science community, and a substantial number of public domain and commercial
data analysis systems can now accept netCDF datasets as input.
Several organizations have adopted netCDF as a data access standard, and there is an
effort underway at the National Center for Supercomputing Applications (NCSA, which is
associated with the University of Illinois at Urbana-Champaign) to support the netCDF
programming interfaces as a means to store and retrieve data in "HDF files," i.e., in the
format used by the popular NCSA tools. We have encouraged and cooperated with these
efforts.
Questions occasionally arise about the level of support provided for the netCDF software.
Unidata’s formal position, stated in the copyright notice which accompanies the netCDF
library, is that the software is provided "as is". In practice, the software is updated from
time to time, and Unidata intends to continue making improvements for the foreseeable
future. Because Unidata’s mission is to serve geoscientists at U.S. universities, problems
reported by that community necessarily receive the greatest attention.
We hope the reader will find the software useful and will give us feedback on its
application as well as suggestions for its improvement.
David Fulker, 1996
Unidata Program Center Director, University Corporation for Atmospheric Research
Summary
The purpose of the Network Common Data Form (netCDF) interface is to allow you to
create, access, and share array-oriented data in a form that is self-describing and portable.
"Self-describing" means that a dataset includes information defining the data it contains.
"Portable" means that the data in a dataset is represented in a form that can be accessed by
computers with different ways of storing integers, characters, and floating-point numbers.
Using the netCDF interface for creating new datasets makes the data portable. Using the
netCDF interface in software for data access, management, analysis, and display can make
the software more generally useful.
The netCDF software includes C, Fortran 77, Fortran 90, and C++ interfaces for accessing
netCDF data. These libraries are available for many common computing platforms.
The community of netCDF users has contributed ports of the software to additional
platforms and interfaces for other programming languages as well. Source code for netCDF
software libraries is freely available to encourage the sharing of both array-oriented data
and the software that makes the data useful.
This User’s Guide presents the netCDF data model. It explains how the netCDF data
model uses dimensions, variables, and attributes to store data. Language-specific
programming guides are available for C (see The NetCDF C Interface Guide), C++
(see The NetCDF C++ Interface Guide), Fortran 77 (see The NetCDF Fortran 77
Interface Guide), and Fortran 90 (see The NetCDF Fortran 90 Interface Guide).
Reference documentation for UNIX systems, in the form of UNIX ’man’ pages
for the C and FORTRAN interfaces, is also available at the netCDF web site
(http://www.unidata.ucar.edu/netcdf), and with the netCDF distribution.
The latest version of this document, and the language-specific guides, can be found at
the netCDF web site, http://www.unidata.ucar.edu/netcdf/docs, along with extensive
additional information about netCDF, including pointers to other software that works with
netCDF data.
Separate documentation of the Java netCDF library can be found at
http://www.unidata.ucar.edu/software/netcdf-java.
For installation and porting information, see The NetCDF Installation
and Porting Guide.
1 Introduction
Related to this is a second problem with general-purpose database systems: their poor
performance on large arrays. Collections of satellite images, scientific model outputs and
long-term global weather observations are beyond the capabilities of most database systems
to organize and index for efficient retrieval.
Finally, general-purpose database systems provide, at significant cost in terms of both
resources and access performance, many facilities that are not needed in the analysis,
management, and display of array-oriented data. For example, elaborate update facilities, audit
trails, report formatting, and mechanisms designed for transaction-processing are
unnecessary for most scientific applications.
cannot read 64-bit offset format files, and library versions before 4.0 can’t read
netCDF-4/HDF5 files. NetCDF classic format files (even if created by version 3.6.0 or later) remain
compatible with older versions of the netCDF library.
Users are encouraged to use netCDF classic format to distribute data, for maximum
portability.
To select 64-bit offset or netCDF-4 format files, C programmers should use flag
NC_64BIT_OFFSET or NC_NETCDF4 in function nc_create. See Section “nc_create” in
The NetCDF C Interface Guide.
In Fortran, use flag nf_64bit_offset or nf_format_netcdf4 in function NF_CREATE. See
Section “NF_CREATE” in The NetCDF Fortran 77 Interface Guide.
It is also possible to change the default creation format, to convert a large
body of code without changing every create call. C programmers see Section
“nc_set_default_format” in The NetCDF C Interface Guide. Fortran programmers see Section
“NF_SET_DEFAULT_FORMAT” in The NetCDF Fortran 77 Interface Guide.
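As a rough illustration of how the three formats can be told apart, the sketch below guesses
a file's format from its first bytes and looks up the oldest library version that this guide
says can read it. It is written in Python with the standard library only. The names
guess_netcdf_format and MIN_LIBRARY_VERSION are hypothetical helpers, not part of any
netCDF interface. The magic numbers are assumptions based on the published file format
descriptions, and the check ignores HDF5 files whose signature does not sit at offset 0.

```python
import io

# Assumed magic numbers; the file format specification is the authoritative source.
CLASSIC_MAGIC = b"CDF\x01"
OFFSET64_MAGIC = b"CDF\x02"
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

# Oldest netCDF library version able to read each format, as stated in this guide.
MIN_LIBRARY_VERSION = {
    "classic": "any",
    "64-bit offset": "3.6.0",
    "netCDF-4/HDF5": "4.0",
}

def guess_netcdf_format(stream):
    """Return a format name based on the first 8 bytes of a binary stream."""
    head = stream.read(8)
    if head.startswith(CLASSIC_MAGIC):
        return "classic"
    if head.startswith(OFFSET64_MAGIC):
        return "64-bit offset"
    if head == HDF5_SIGNATURE:
        return "netCDF-4/HDF5"
    return "unknown"

# Usage with an in-memory stand-in for a file header (not a complete dataset).
fake_header = io.BytesIO(OFFSET64_MAGIC + b"\x00\x00\x00\x00")
fmt = guess_netcdf_format(fake_header)
print(fmt, "needs library version", MIN_LIBRARY_VERSION[fmt])
```

With a real file, the same function could be given the result of open(path, "rb"). A
production tool would more likely ask the netCDF library itself which format a file uses.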
the desired generality was not practical. NASA’s CDF and Unidata’s netCDF have since
evolved separately, but recent CDF versions share many characteristics with netCDF.
In early 1988, Joe Fahle of SeaSpace, Inc. (a commercial software development firm in
San Diego, California), a participant in the 1987 Unidata CDF workshop, independently
developed a CDF package in C that extended the NASA CDF interface in several important
ways (Fahle, 1989). Like Raymond’s package, the SeaSpace CDF software permitted
variables with unrelated shapes to be included in the same data object and permitted a general
form of access to multidimensional arrays. Fahle’s implementation was used at SeaSpace
as the intermediate form of storage for a variety of steps in their image-processing system.
This interface and format have subsequently evolved into the Terascan data format.
After studying Fahle’s interface, we concluded that it solved many of the problems we
had identified in trying to stretch the NASA interface to our purposes. In August 1988, we
convened a small workshop to agree on a Unidata netCDF interface, and to resolve remaining
open issues. Attending were Joe Fahle of SeaSpace, Michael Gough of Apple (an author
of the NASA CDF software), Angel Li of the University of Miami (who had implemented
our prototype netCDF software on VMS and was a potential user), and Unidata systems
development staff. Consensus was reached at the workshop after some further simplifications
were discovered. A document incorporating the results of the workshop into a proposed
Unidata netCDF interface specification was distributed widely for comments before Glenn
Davis and Russ Rew implemented the first version of the software. Comparison with other
data-access interfaces and experience using netCDF are discussed in Rew and Davis (1990a),
Rew and Davis (1990b), Jenter and Signell (1992), and Brown, Folk, Goucher, and Rew
(1993).
In October 1991, we announced version 2.0 of the netCDF software distribution. Slight
modifications to the C interface (declaring dimension lengths to be long rather than int)
improved the usability of netCDF on inexpensive platforms such as MS-DOS computers,
without requiring recompilation on other platforms. This change to the interface required
no changes to the associated file format.
Release of netCDF version 2.3 in June 1993 preserved the same file format but added
single call access to records, optimizations for accessing cross-sections involving non-contiguous
data, subsampling along specified dimensions (using ’strides’), accessing non-contiguous
data (using ’mapped array sections’), improvements to the ncdump and ncgen utilities, and
an experimental C++ interface.
In version 2.4, released in February 1996, support was added for new platforms and for
the C++ interface, significant optimizations were implemented for supercomputer
architectures, and the file format was formally specified in an appendix to the User’s Guide.
FAN (File Array Notation), software providing a high-level interface to netCDF data,
was made available in May 1996. The capabilities of the FAN utilities include extracting
and manipulating array data from netCDF datasets, printing selected data from netCDF
arrays, copying ASCII data into netCDF arrays, and performing various operations (sum,
mean, max, min, product, and others) on netCDF arrays.
In 1996 and 1997, Joe Sirott implemented and made available the first read-only
netCDF interface for Java, Bill Noon made a Python module available for
netCDF, and Konrad Hinsen contributed another netCDF interface for Python.
In May 1997, Version 3.3 of netCDF was released. This included a new type-safe interface
for C and Fortran, as well as many other improvements. A month later, Charlie Zender
released version 1.0 of the NCO (netCDF Operators) package, providing command-line
utilities for general purpose operations on netCDF data.
Version 3.4 of Unidata’s netCDF software, released in March 1998, included initial large
file support, performance enhancements, and improved Cray platform support. Later in
1998, Dan Schmitt provided a Tcl/Tk interface, and Glenn Davis provided version 1.0 of
netCDF for Java.
In May 1999, Glenn Davis, who was instrumental in creating and developing netCDF,
died in a small plane crash during a thunderstorm. The memory of Glenn’s passions and
intellect continues to inspire those of us who worked with him.
In February 2000, an experimental Fortran 90 interface developed by Robert Pincus was
released.
John Caron released netCDF for Java, version 2.0 in February 2001. This version
incorporated a new high-performance package for multidimensional arrays, simplified the
interface, and included OpenDAP (known previously as DODS) remote access, as well as
remote netCDF access via HTTP contributed by Don Denbo.
In March 2001, NetCDF 3.5.0 was released. This release fully integrated the new Fortran
90 interface, enhanced portability, improved the C++ interface, and added a few new tuning
functions.
Also in 2001, Takeshi Horinouchi and colleagues made a netCDF interface for Ruby
available, as did David Pierce for the R language for statistical computing and graphics.
Charles Denham released WetCDF, an independent implementation of the netCDF interface
for Matlab, as well as updates to the popular netCDF Toolbox for Matlab.
In 2002, Unidata and collaborators developed NcML, an XML representation for netCDF
data useful for cataloging data holdings, aggregation of data from multiple datasets,
augmenting metadata in existing datasets, and support for alternative views of data. The Java
interface currently provides access to netCDF data through NcML.
Additional developments in 2002 included translation of C and Fortran User Guides
into Japanese by Masato Shiotani and colleagues, creation of a “Best Practices” guide for
writing netCDF files, and provision of an Ada-95 interface by Alexandru Corlan.
In July 2003 a group of researchers at Northwestern University and Argonne National
Laboratory (Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur,
William Gropp, and Rob Latham) contributed a new parallel interface for writing and
reading netCDF data, tailored for use on high performance platforms with parallel I/O. The
implementation built on the MPI-IO interface, providing portability to many platforms.
In October 2003, Greg Sjaardema contributed support for an alternative format with
64-bit offsets, to provide more complete support for very large files. These changes, with
slight modifications at Unidata, were incorporated into version 3.6.0, released in December,
2004.
In 2004, thanks to a NASA grant, Unidata and NCSA began a collaboration to increase
the interoperability of netCDF and HDF5, and bring some advanced HDF5 features to
netCDF users.
In February, 2006, release 3.6.1 fixed some minor bugs.
In March, 2007, release 3.6.2 introduced an improved build system that used automake
and libtool, and an upgrade to the most recent autoconf release, to support shared libraries
and the netcdf-4 builds. This release also introduced the NetCDF Tutorial and example
programs.
The first beta release of netCDF-4.0 was celebrated with a giant party at Unidata in
April, 2007. Over 2000 people danced ’til dawn at the NCAR Mesa Lab, listening to the
Flaming Lips and the Denver Gilbert & Sullivan repertory company.
In June, 2008, netCDF-4.0 was released. Version 3.6.3, the same code but with netcdf-4
features turned off, was released at the same time. The 4.0 release uses HDF5 1.8.1 as the
data storage layer for netcdf, and introduces many new features including groups and
user-defined types. The 3.6.3/4.0 releases also introduced handling of UTF8-encoded Unicode
names.
Gbyte, which is 1,000,000,000 bytes.) of data in a single netCDF dataset. (see Section 4.6
[Classic Limitations], page 38). This limitation is a result of 32-bit offsets used for storing
relative offsets within a classic netCDF format file. Since one of the goals of netCDF is
portable data and some computing platforms still can’t deal with files larger than 2 GiB,
it is best to keep files that must be portable below this limit. Nevertheless, it is possible
to create and access netCDF files larger than 2 GiB on platforms that provide support for
such files (see Section 4.4 [Large File Support], page 37).
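The arithmetic behind the 2 GiB figure can be checked with a few lines of Python. This is
only a sketch: it assumes the 32-bit offsets are signed integers, which is an assumption
made here for illustration, and the helper name max_signed_offset is hypothetical.

```python
GIBYTE = 2**30   # 1 GiByte = 1,073,741,824 bytes
GBYTE = 10**9    # 1 Gbyte  = 1,000,000,000 bytes

def max_signed_offset(bits):
    """Largest value a signed integer of the given width can hold."""
    return 2**(bits - 1) - 1

limit_32 = max_signed_offset(32)
print(limit_32)                 # 2147483647, one byte short of 2 GiB
print(2 * GIBYTE - 2 * GBYTE)   # 147483648 bytes separate 2 GiB from 2 Gbyte
```

The second figure shows why the distinction between GiByte and Gbyte matters near the
limit: the two readings of "2 gigabytes" differ by roughly 147 million bytes.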
The new 64-bit offset format allows large files, and makes it easy to create fixed
variables of about 4 GiB, and record variables of about 4 GiB per record (see Section 4.5
[64 bit Offset Limitations], page 38). However, old netCDF applications will not be able to
read the 64-bit offset files until they are upgraded to at least version 3.6.0 of netCDF (i.e.
the version in which 64-bit offset format was introduced).
With the netCDF-4/HDF5 format, size limitations are further relaxed, and files can be
as large as the underlying file system supports. NetCDF-4/HDF5 files are unreadable to
the netCDF library before version 4.0.
Another limitation of the classic (and 64-bit offset) model is that only one unlimited
(changeable) dimension is permitted for each netCDF data set. Multiple variables can share
an unlimited dimension, but then they must all grow together. Hence the classic netCDF
model does not permit variables with several unlimited dimensions or the use of multiple
unlimited dimensions in different variables within the same dataset. Variables that have
non-rectangular shapes (for example, ragged arrays) cannot be represented conveniently.
In netCDF-4/HDF5 files, multiple unlimited dimensions are fully supported. Any
variable can be defined with any combination of limited and unlimited dimensions.
The extent to which data can be completely self-describing is limited: there is always
some assumed context without which sharing and archiving data would be impractical.
NetCDF permits storing meaningful names for variables, dimensions, and attributes; units
of measure in a form that can be used in computations; text strings for attribute values that
apply to an entire data set; and simple kinds of coordinate system information. But for
more complex kinds of metadata (for example, the information necessary to provide accurate
georeferencing of data on unusual grids or from satellite images), it is often necessary to
develop conventions.
Specific additions to the netCDF data model might make some of these conventions
unnecessary or allow some forms of metadata to be represented in a uniform and compact
way. For example, adding explicit georeferencing to the netCDF data model would simplify
elaborate georeferencing conventions at the cost of complicating the model. The problem
is finding an appropriate trade-off between the richness of the model and its generality
(i.e., its ability to encompass many kinds of data). A data model tailored to capture the
shared context among researchers within one discipline may not be appropriate for sharing
or combining data from multiple disciplines.
The classic netCDF data model does not support nested data structures such as trees,
nested arrays, or other recursive structures. (This limitation also applies to 64-bit offset
files.) Through use of indirection and conventions it is possible to represent some kinds of
nested structures, but the result may fall short of the netCDF goal of self-describing data.
In netCDF-4/HDF5 format files, the introduction of the compound type allows the
creation of complex data types, involving any combination of types. The VLEN type allows
efficient storage of ragged arrays, and the introduction of hierarchical groups allows users
to organize data.
Finally, for classic and 64-bit offset files, concurrent access to a netCDF dataset is limited.
One writer and multiple readers may access data in a single dataset simultaneously, but
there is no support for multiple concurrent writers.
NetCDF-4 supports parallel read/write access to netCDF-4/HDF5 files, using the
underlying HDF5 library.
For more information about HDF5, see the HDF5 web site: http://hdfgroup.org/HDF5/.
1.12 References
1. Brown, S. A., M. Folk, G. Goucher, and R. Rew, "Software for Portable Scientific Data
Management," Computers in Physics, American Institute of Physics, Vol. 7, No. 3,
May/June 1993.
2. Davies, H. L., "FAN - An array-oriented query language," Second Workshop on
Database Issues for Data Visualization (Visualization 1995), Atlanta, Georgia, IEEE,
October 1995.
3. Fahle, J., TeraScan Applications Programming Interface, SeaSpace, San Diego,
California, 1989.
4. Fulker, D. W., "The netCDF: Self-Describing, Portable Files—a Basis for
’Plug-Compatible’ Software Modules Connectable by Networks," ICSU Workshop on
Geophysical Informatics, Moscow, USSR, August 1988.
5. Fulker, D. W., "Unidata Strawman for Storing Earth-Referencing Data," Seventh
International Conference on Interactive Information and Processing Systems for
Meteorology, Oceanography, and Hydrology, New Orleans, La., American Meteorological Society,
January 1991.
6. Gough, M. L., NSSDC CDF Implementer’s Guide (DEC VAX/VMS) Version 1.1,
National Space Science Data Center, 88-17, NASA/Goddard Space Flight Center, 1988.
7. Jenter, H. L. and R. P. Signell, "NetCDF: A Freely-Available Software-Solution to
Data-Access Problems for Numerical Modelers," Proceedings of the American Society
of Civil Engineers Conference on Estuarine and Coastal Modeling, Tampa, Florida,
1992.
8. Raymond, D. J., "A C Language-Based Modular System for Analyzing and Displaying
Gridded Numerical Data," Journal of Atmospheric and Oceanic Technology, 5, 501-511,
1988.
9. Rew, R. K. and G. P. Davis, "The Unidata netCDF: Software for Scientific Data
Access," Sixth International Conference on Interactive Information and Processing
Systems for Meteorology, Oceanography, and Hydrology, Anaheim, California, American
Meteorological Society, February 1990.
10. Rew, R. K. and G. P. Davis, "NetCDF: An Interface for Scientific Data Access,"
Computer Graphics and Applications, IEEE, pp. 76-82, July 1990.
11. Rew, R. K. and G. P. Davis, "Unidata’s netCDF Interface for Data Access: Status
and Plans," Thirteenth International Conference on Interactive Information and
Processing Systems for Meteorology, Oceanography, and Hydrology, Anaheim, California,
American Meteorological Society, February 1997.
12. Treinish, L. A. and M. L. Gough, "A Software Package for the Data Independent
Management of Multi-Dimensional Data," EOS Transactions, American Geophysical
Union, 68, 633-635, 1987.
2 Components of a NetCDF Dataset
a convenient way of describing netCDF datasets. The netCDF system includes utilities
for producing human-oriented CDL text files from binary netCDF datasets and vice
versa. (The ncdump utility has recently been enhanced to accommodate netCDF-4 features
in the CDL output, but the example here is restricted to netCDF-3 CDL.)
The CDL notation for a netCDF dataset can be generated automatically by using
ncdump, a utility program described later (see Section 5.5 [ncdump], page 55). Another
netCDF utility, ncgen, generates a netCDF dataset (or optionally C or FORTRAN source
code containing calls needed to produce a netCDF dataset) from CDL input (see Section 5.4
[ncgen], page 53). The ncgen4 program is similar to ncgen, but can produce netcdf-4 files
and can utilize CDL input that includes the netcdf-4 data model constructs.
The CDL notation is simple and largely self-explanatory. It will be explained more fully
as we describe the components of a netCDF dataset. For now, note that CDL statements
are terminated by a semicolon. Spaces, tabs, and newlines can be used freely for readability.
Comments in CDL follow the characters ’//’ on any line. A CDL description of a netCDF
dataset takes the form
    netCDF name {
      dimensions: ...
      variables: ...
      data: ...
    }
where the name is used only as a default in constructing file names by the ncgen utility.
The CDL description consists of three optional parts, introduced by the keywords
dimensions, variables, and data. NetCDF dimension declarations appear after the dimensions
keyword, netCDF variables and attributes are defined after the variables keyword, and
variable data assignments appear after the data keyword.
The ncgen utility provides a command line option which indicates the desired output
format. Limitations are enforced for the selected format - that is, some CDL files may be
expressible only in 64-bit offset or NetCDF-4 format.
For example, trying to create a file with very large variables in classic format may result
in an error because size limits are violated.
2.2 Dimensions
A dimension may be used to represent a real physical dimension, for example, time, latitude,
longitude, or height. A dimension might also be used to index other quantities, for example
station or model-run-number.
A netCDF dimension has both a name and a length.
A dimension length is an arbitrary positive integer, except that one dimension in a classic
or 64-bit offset netCDF dataset can have the length UNLIMITED. Such a dimension is called
the unlimited dimension or the record dimension. In a netCDF-4 dataset, any number of
unlimited dimensions can be used.
A variable with an unlimited dimension can grow to any length along that dimension. The
unlimited dimension index is like a record number in conventional record-oriented files.
A netCDF classic or 64-bit offset dataset can have at most one unlimited dimension, but
need not have any. If a variable has an unlimited dimension, that dimension must be the
most significant (slowest changing) one. Thus any unlimited dimension must be the first
dimension in a CDL shape and the first dimension in corresponding C array declarations.
A netCDF-4 dataset may have multiple unlimited dimensions, and there are no restrictions
on their order in the list of a variable’s dimensions.
To grow variables along an unlimited dimension, write the data using any of the netCDF
data writing functions, setting the index of the unlimited dimension to the desired
record number. The netCDF library will write however many records are needed, using the
fill value (unless that feature is turned off) to fill in any intervening records.
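The growth of a record variable can be pictured with a small Python model. This is a
minimal sketch, not the netCDF library: the class RecordVariable, its put_record method,
and the fill value -999.0 are hypothetical names and values chosen only for illustration.

```python
# Toy model of a variable growing along the unlimited dimension.
# RecordVariable and put_record are hypothetical; they are not netCDF calls.


class RecordVariable:
    def __init__(self, fill=-999.0):
        self.fill = fill    # value used for records never written explicitly
        self.records = []   # one entry per record

    def put_record(self, index, value):
        """Write value at record number index, filling intervening records."""
        while len(self.records) <= index:
            self.records.append(self.fill)
        self.records[index] = value

    @property
    def num_records(self):
        return len(self.records)


var = RecordVariable()
var.put_record(0, 1.5)
var.put_record(3, 4.5)   # records 1 and 2 are created and hold the fill value
print(var.records)       # [1.5, -999.0, -999.0, 4.5]
```

Writing record 3 directly extends the variable to four records, which mirrors how the
record count follows the highest record number written.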
CDL dimension declarations may appear on one or more lines following the CDL keyword
dimensions. Multiple dimension declarations on the same line may be separated by commas.
Each declaration is of the form name = length. Use the “/” character to include group
information (netCDF-4 output only).
There are four dimensions in the above example: lat, lon, level, and time (see Section 2.1
[Data Model], page 17). The first three are assigned fixed lengths; time is assigned the length
UNLIMITED, which means it is the unlimited dimension.
The basic unit of named data in a netCDF dataset is a variable. When a variable is
defined, its shape is specified as a list of dimensions. These dimensions must already exist.
The number of dimensions is called the rank (a.k.a. dimensionality). A scalar variable has
rank 0, a vector has rank 1 and a matrix has rank 2.
It is possible (since version 3.1 of netCDF) to use the same dimension more than once
in specifying a variable shape. For example, correlation(instrument, instrument) could
be a matrix giving correlations between measurements using different instruments. But
data whose dimensions correspond to those of physical space/time should have a shape
comprising different dimensions, even if some of these have the same length.
2.3 Variables
Variables are used to store the bulk of the data in a netCDF dataset. A variable represents
an array of values of the same type. A scalar value is treated as a 0-dimensional array. A
variable has a name, a data type, and a shape described by its list of dimensions specified
when the variable is created. A variable may also have associated attributes, which may be
added, deleted or changed after the variable is created.
A variable’s external data type is one of a small set of netCDF types. In classic and 64-bit
offset files, only the original six types are available (byte, character, short, int, float, and
double). Variables in netCDF-4 files may also use unsigned byte, unsigned short, unsigned int,
64-bit int, unsigned 64-bit int, or string. Alternatively, the user may define a type: an opaque
blob of bytes, an array of variable length arrays, or a compound type, which acts like a C struct.
For more information on types for the C interface, see Section “Variable Types” in The
NetCDF C Interface Guide.
For more information on types for the Fortran interface, see Section “Variable Types”
in The NetCDF Fortran 77 Interface Guide.
In the CDL notation, the classic and 64-bit offset types can be used. They are given the
simpler names byte, char, short, int, float, and double. real may be used as a synonym for
float in the CDL notation. long is a deprecated synonym for int. For the exact meaning
of each of the types see Section 3.1 [External Types], page 25. The ncgen4 utility supports
new primitive types with names ubyte, ushort, uint, int64, uint64, and string.
CDL variable declarations appear after the variables keyword in a CDL unit. They have
the form
type variable_name ( dim_name_1, dim_name_2, ... );
for variables with dimensions, or
type variable_name;
for scalar variables.
In the above CDL example there are six variables. As discussed below, four of these
are coordinate variables. The remaining variables (sometimes called primary variables),
temp and rh, contain what is usually thought of as the data. Each of these variables has
the unlimited dimension time as its first dimension, so they are called record variables. A
variable that is not a record variable has a fixed length (number of data values) given by
the product of its dimension lengths. The length of a record variable is also the product of
its dimension lengths, but in this case the product is variable because it involves the length
of the unlimited dimension, which can vary. The length of the unlimited dimension is the
number of records.
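The length calculation above is simple arithmetic, sketched here in Python. The function
name num_values and the dimension lengths used are assumptions made for illustration;
None marks the unlimited dimension.

```python
from math import prod


def num_values(shape, lengths, num_records=0):
    """Number of data values in a variable with the given shape.

    shape       -- tuple of dimension names, most significant first
    lengths     -- dict of dimension name -> length (None = unlimited)
    num_records -- current length of the unlimited dimension
    """
    return prod(num_records if lengths[d] is None else lengths[d] for d in shape)


# Hypothetical dimension lengths; time is the unlimited dimension.
lengths = {"lat": 5, "lon": 10, "level": 4, "time": None}

print(num_values((), lengths))                                # 1 (scalar, rank 0)
print(num_values(("lat",), lengths))                          # 5 (fixed length)
print(num_values(("time", "level", "lat", "lon"), lengths, 3))  # 600 with 3 records
```

The record variable's length changes whenever the number of records changes, while the
fixed-size variable always holds five values.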
2.4 Attributes
NetCDF attributes are used to store data about the data (ancillary data or metadata), sim-
ilar in many ways to the information stored in data dictionaries and schema in conventional
database systems. Most attributes provide information about a specific variable. These are
identified by the name (or ID) of that variable, together with the name of the attribute.
Some attributes provide information about the dataset as a whole and are called global
attributes. These are identified by the attribute name together with a blank variable name
(in CDL) or a special null "global variable" ID (in C or Fortran).
In netCDF-4 files, attributes can also be added at the group level.
An attribute has an associated variable (the null "global variable" for a global or group-
level attribute), a name, a data type, a length, and a value. The current version treats all
attributes as vectors; scalar values are treated as single-element vectors.
Conventional attribute names should be used where applicable. New names should be
as meaningful as possible.
The external type of an attribute is specified when it is created. The types permitted for
attributes are the same as the netCDF external data types for variables. Attributes with
the same name for different variables should sometimes be of different types. For example,
the attribute valid_max specifying the maximum valid data value for a variable of type int
should be of type int, whereas the attribute valid_max for a variable of type double should
instead be of type double.
Attributes are more dynamic than variables or dimensions; they can be deleted and have
their type, length, and values changed after they are created, whereas the netCDF interface
provides no way to delete a variable or to change its type or shape.
The CDL notation for defining an attribute is
variable_name:attribute_name = list_of_values;
for a variable attribute, or
:attribute_name = list_of_values;
for a global attribute. For a group level attribute (netCDF-4 files only):
:group_name/subgroup_name/attribute_name = list_of_values;
Groups will be created as needed to store the attributes.
For the netCDF classic model, the type and length of each attribute are not explicitly
declared in CDL; they are derived from the values assigned to the attribute. All values of
an attribute must be of the same type. The notation used for constant values of the various
netCDF types is discussed later (see Section 5.3 [CDL Constants], page 52). The extended
CDL syntax for the enhanced data model supported by netCDF-4 requires type declarations
for attributes of user-defined types. See ncdump output or the reference documentation for
ncgen4 for details of the extended CDL syntax.
In the netCDF example (see Section 2.1 [Data Model], page 17), units is an attribute
for the variable lat that has a 13-character array value ’degrees_north’. And valid_range is
an attribute for the variable rh that has length 2 and values ’0.0’ and ’1.0’.
One global attribute, called “source”, is defined for the example netCDF dataset. This is
a character array intended for documenting the data. Actual netCDF datasets might have
more global attributes to document the origin, history, conventions, and other characteristics
of the dataset as a whole.
Most generic applications that process netCDF datasets assume standard attribute
conventions, and it is strongly recommended that these be followed unless there are good
reasons for not doing so. For information about units, long_name, valid_min, valid_max,
valid_range, scale_factor, add_offset, _FillValue, and other conventional attributes, see
Appendix B [Attribute Conventions], page 61.
Attributes may be added to a netCDF dataset long after it is first defined, so you don’t
have to anticipate all potentially useful attributes. However, adding new attributes to an
existing classic or 64-bit offset format dataset can incur the same expense as copying the
dataset. For a more extensive discussion see Chapter 4 [Structure], page 35.
3 Data
This chapter discusses the primitive netCDF external data types, the kinds of data access
supported by the netCDF interface, and how data structures other than arrays may be
implemented in a netCDF dataset.
* These types are available only for netCDF-4 format files. All the unsigned ints (except
NC_CHAR), the 64-bit ints, and string type are for netCDF-4 files only.
These types were chosen to provide a reasonably wide range of trade-offs between data
precision and number of bits required for each value. These external data types are
independent of whatever internal data types are supported by a particular machine and
language combination.
These types are called "external", because they correspond to the portable external
representation for netCDF data. When a program reads external netCDF data into an internal
variable, the data is converted, if necessary, into the specified internal type. Similarly, if you
write internal data into a netCDF variable, this may cause it to be converted to a different
external type, if the external type for the netCDF variable differs from the internal type.
The separation of external and internal types and automatic type conversion have several
advantages. You need not be aware of the external type of numeric variables, since automatic
conversion to or from any desired numeric type is available. You can use this feature to
simplify code, by making it independent of external types, using a sufficiently wide internal
type, e.g., double precision, for numeric netCDF data of several different external types.
Programs need not be changed to accommodate a change to the external type of a variable.
If conversion to or from an external numeric type is necessary, it is handled by the library.
Converting from one numeric type to another may result in an error if the target type is
not capable of representing the converted value. For example, an internal short integer type
may not be able to hold data stored externally as an integer. When accessing an array of
values, a range error is returned if one or more values are out of the range of representable
values, but other values are converted properly.
Note that mere loss of precision in type conversion does not return an error. Thus, if you
read double precision values into a single-precision floating-point variable, for example, no
error results unless the magnitude of the double precision value exceeds the representable
range of single-precision floating point numbers on your platform. Similarly, if you read a
large integer into a float incapable of representing all the bits of the integer in its mantissa,
this loss of precision will not result in an error. If you want to avoid such precision loss,
check the external types of the variables you access to make sure you use an internal type
that has adequate precision.
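The difference between a range error and a mere loss of precision can be illustrated in
Python. This is a sketch of the behaviour described above, not the library's conversion
code; read_as_short and through_float32 are hypothetical helper names, and the sketch
does not model what the library leaves in an out-of-range slot.

```python
import struct

SHORT_MIN, SHORT_MAX = -32768, 32767  # range of a 16-bit signed integer


def read_as_short(values):
    """Convert values to the 16-bit range and report whether any did not fit.

    Out-of-range entries come back as None; in-range entries are still
    converted, mirroring "other values are converted properly".
    """
    converted = [v if SHORT_MIN <= v <= SHORT_MAX else None for v in values]
    range_error = any(c is None for c in converted)
    return converted, range_error


def through_float32(value):
    """Round-trip a number through IEEE single precision."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


print(read_as_short([12, 40000, -7]))  # ([12, None, -7], True): a range error
print(through_float32(16777217))       # 16777216.0: precision lost, no error
```

The integer 16777217 needs 25 significant bits, one more than a single-precision mantissa
provides, so it silently rounds to a neighbouring value.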
The names for the primitive external data types (byte, char, short, ushort, int, uint,
int64, uint64, float or real, double, string) are reserved words in CDL, so the names of
variables, dimensions, and attributes must not be type names.
It is possible to interpret byte data as either signed (-128 to 127) or unsigned (0 to 255).
However, when reading byte data to be converted into other numeric types, it is interpreted
as signed.
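The two interpretations of a byte differ only for bit patterns above 127. A minimal Python
sketch of the reinterpretation (the function names are illustrative, not part of any netCDF
interface):

```python
def as_signed_byte(u):
    """Reinterpret an unsigned byte value (0..255) as signed (-128..127)."""
    if not 0 <= u <= 255:
        raise ValueError("not an unsigned byte value")
    return u - 256 if u > 127 else u


def as_unsigned_byte(s):
    """Reinterpret a signed byte value (-128..127) as unsigned (0..255)."""
    if not -128 <= s <= 127:
        raise ValueError("not a signed byte value")
    return s & 0xFF


print(as_signed_byte(200))    # -56
print(as_unsigned_byte(-56))  # 200
```

A program that stores unsigned quantities in byte variables therefore has to apply the
second mapping itself after reading the data into a wider numeric type.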
For the correspondence between netCDF external data types and the data types of a
language see Section 2.3 [Variables], page 20.
int row_start(max_rows);
data:
row_start = 0, 12, 19, ...
As another example, netCDF variables may be grouped within a netCDF classic or 64-bit
offset dataset by defining attributes that list the names of the variables in each group,
separated by a conventional delimiter such as a space or comma. Using a naming convention
for attribute names for such groupings permits any number of named groups of variables.
A particular conventional attribute for each variable might list the names of the groups
of which it is a member. Use of attributes, or variables that refer to other attributes or
variables, provides a flexible mechanism for representing some kinds of complex structures
in netCDF datasets.
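A reader of such a dataset could recover the groupings from the attributes along the
following lines. This is a sketch under assumed conventions: the attribute-name prefix
"group_" and the helper variable_groups are hypothetical, and the attributes are modelled
as a plain dictionary rather than read through the netCDF interface.

```python
def variable_groups(global_attributes, prefix="group_"):
    """Map group name -> list of variable names, from naming-convention attributes."""
    groups = {}
    for name, value in global_attributes.items():
        if name.startswith(prefix):
            # Accept either spaces or commas as the conventional delimiter.
            groups[name[len(prefix):]] = value.replace(",", " ").split()
    return groups


# Hypothetical global attributes of a dataset.
attrs = {
    "source": "example dataset",
    "group_surface": "temp rh",
    "group_upper_air": "wind, pressure",
}
print(variable_groups(attrs))
# {'surface': ['temp', 'rh'], 'upper_air': ['wind', 'pressure']}
```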
and nc_get_var[1asm] calls to read them. (For example, the nc_put_varm function will write
mapped arrays of these structs.)
While structs, in general, are not portable from platform to platform, the HDF5 layer
(when installed) performs the magic required to figure out your platform’s idiosyncrasies,
and adjust to them. The end result is that HDF5 compound types (and therefore, netCDF-4
compound types), are portable.
For more information on creating and using compound types, see Section “Compound
Types” in The NetCDF C Interface Guide.
3.3.5 Groups
Although not a type of data, groups can help organize data within a dataset. Like a directory
structure on a Unix file-system, the grouping feature allows users to organize variables and
dimensions into distinct, named, hierarchical areas, called groups. For more information on
groups, see Section “Groups” in The NetCDF C Interface Guide.
the access function corresponds to the internal type of the data. If the internal type has
a different representation from the external type of the variable, a conversion between the
internal type and external type will take place when the data is read or written.
Access to data in classic and 64-bit offset format is direct. Access to netCDF-4 data is
buffered by the HDF5 layer. In either case you can access a small subset of data from a
large dataset efficiently, without first accessing all the data that precedes it.
Reading and writing data by specifying a variable, instead of a position in a file, makes
data access independent of how many other variables are in the dataset, making programs
immune to data format changes that involve adding more variables to the data.
In the C and FORTRAN interfaces, datasets are not specified by name every time you
want to access data, but instead by a small integer called a dataset ID, obtained when the
dataset is first created or opened.
Similarly, a variable is not specified by name for every data access either, but by a
variable ID, a small integer used to identify each variable in a netCDF dataset.
The use of mapped array sections is discussed more fully below, but first we present an
example of the more commonly used array-section access.
...
float temp[TIMES][LEVELS][LATS][LONS];
using a multidimensional array declaration.
To specify the block of data that represents just the second level, all times, all latitudes,
and all longitudes, we need to provide a start index and some edge lengths. The start index
should be (0, 1, 0, 0) in C, because we want to start at the beginning of each of the time,
lon, and lat dimensions, but we want to begin at the second value of the level dimension.
The edge lengths should be (3, 1, 5, 10) in C, since we want to get data for all three time
values, only one level value, all five lat values, and all 10 lon values. We should expect to
get a total of 150 floating-point values returned (3 * 1 * 5 * 10), and should provide enough
space in our array for this many. The order in which the data will be returned is with the
last dimension, lon, varying fastest:
temp[0][1][0][0]
temp[0][1][0][1]
temp[0][1][0][2]
temp[0][1][0][3]
...
temp[2][1][4][7]
temp[2][1][4][8]
temp[2][1][4][9]
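The count and ordering of the values in this array section can be checked with a few lines
of Python. This sketch only enumerates index tuples; the function name section_indices is
an assumption, and no netCDF call is involved.

```python
from itertools import product


def section_indices(start, count):
    """List the index tuples of an array section in C order (last index fastest)."""
    ranges = [range(s, s + c) for s, c in zip(start, count)]
    return list(product(*ranges))


# Start index and edge lengths from the text: (time, level, lat, lon).
indices = section_indices(start=(0, 1, 0, 0), count=(3, 1, 5, 10))

print(len(indices))   # 150
print(indices[:4])    # [(0, 1, 0, 0), (0, 1, 0, 1), (0, 1, 0, 2), (0, 1, 0, 3)]
print(indices[-1])    # (2, 1, 4, 9)
```

itertools.product advances its rightmost range fastest, which is the same ordering as the
C listing above.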
Different dimension orders for the C, FORTRAN, or other language interfaces do not
reflect a different order for values stored on the disk, but merely different orders supported
by the procedural interfaces to the languages. In general, it does not matter whether a
netCDF dataset is written using the C, FORTRAN, or another language interface; netCDF
datasets written from any supported language may be read by programs written in other
supported languages.
Note that, although the netCDF abstraction allows the use of subsampled or mapped
array-section access, their use is not required. If you do not need these more general forms
of access, you may ignore these capabilities and use single value access or regular array
section access instead.
...
TEMP( 8, 5, 2, 3)
TEMP( 9, 5, 2, 3)
TEMP(10, 5, 2, 3)
Different dimension orders for the C, FORTRAN, or other language interfaces do not
reflect a different order for values stored on the disk, but merely different orders supported
by the procedural interfaces to the languages. In general, it does not matter whether a
netCDF dataset is written using the C, FORTRAN, or another language interface; netCDF
datasets written from any supported language may be read by programs written in other
supported languages.
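The relationship between the two index conventions can be stated in one line of Python:
reverse the index order and shift between 0-based and 1-based numbering. The helper
names below are illustrative only.

```python
def c_to_fortran(index):
    """Convert a 0-based C index tuple to the 1-based FORTRAN index of the same value."""
    return tuple(i + 1 for i in reversed(index))


def fortran_to_c(index):
    """Convert a 1-based FORTRAN index tuple to the equivalent 0-based C index."""
    return tuple(i - 1 for i in reversed(index))


print(c_to_fortran((2, 1, 4, 9)))   # (10, 5, 2, 3), i.e. TEMP(10, 5, 2, 3)
print(fortran_to_c((8, 5, 2, 3)))   # (2, 1, 4, 7), i.e. temp[2][1][4][7]
```

Both index tuples name the same stored value, which is why the language used to write a
dataset does not matter to a reader using another language.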
the netCDF variables are. On reading netCDF data, integers of various sizes and single-
precision floating-point values will all be converted to double-precision, if you use the data
access interface for double-precision values. Of course, you can avoid automatic numeric
conversion by using the netCDF interface for a value type that corresponds to the external
data type of each netCDF variable, where such value types exist.
The automatic numeric conversions performed by netCDF are easy to understand, because
they behave just like assignment of data of one type to a variable of a different type.
For example, if you read floating-point netCDF data as integers, the result is truncated
towards zero, just as it would be if you assigned a floating-point value to an integer variable.
Such truncation is an example of the loss of precision that can occur in numeric conversions.
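Truncation towards zero differs from rounding down for negative values. Python's int()
happens to truncate the same way, so it can stand in for the conversion in a short
illustration (convert_to_int is a hypothetical name, not a netCDF function):

```python
import math


def convert_to_int(x):
    """Mimic assigning a floating-point value to an integer variable."""
    return int(x)  # int() truncates towards zero


for x in (2.7, -2.7):
    print(x, convert_to_int(x), math.floor(x))
# 2.7 2 2
# -2.7 -2 -3
```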
Converting from one numeric type to another may result in an error if the target type is
not capable of representing the converted value. For example, an integer may not be able to
hold data stored externally as an IEEE floating-point number. When accessing an array of
values, a range error is returned if one or more values are out of the range of representable
values, but other values are converted properly.
Note that mere loss of precision in type conversion does not result in an error. For
example, if you read double precision values into an integer, no error results unless the
magnitude of the double precision value exceeds the representable range of integers on your
platform. Similarly, if you read a large integer into a float incapable of representing all the
bits of the integer in its mantissa, this loss of precision will not result in an error. If you
want to avoid such precision loss, check the external types of the variables you access to
make sure you use an internal type that has a compatible precision.
Whether a range error occurs in writing a large floating-point value near the boundary
of representable values may depend on the platform. The largest floating-point value
you can write to a netCDF float variable is the largest floating-point number representable
on your system that is less than 2 to the 128th power. The largest double precision value
you can write to a double variable is the largest double-precision number representable on
your system that is less than 2 to the 1024th power.
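On a platform with IEEE arithmetic these limits can be computed directly. The Python
sketch below assumes IEEE single and double precision; fits_in_float is a hypothetical
helper that uses the struct module's overflow check as a stand-in for a range test.

```python
import struct
import sys

# Largest finite IEEE single-precision value: 24 one-bits scaled just below 2**128.
FLOAT_MAX = (2 - 2.0**-23) * 2.0**127
# Largest finite double-precision value on this platform.
DOUBLE_MAX = sys.float_info.max


def fits_in_float(x):
    """True if the finite value x can be stored in IEEE single precision."""
    try:
        struct.pack("<f", x)
    except OverflowError:
        return False
    return True


print(FLOAT_MAX)                  # about 3.4028235e+38
print(fits_in_float(FLOAT_MAX))   # True
print(fits_in_float(2.0**128))    # False
print(DOUBLE_MAX < 2**1024)       # True
```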
Chapter 4: File Structure and Performance 35
before writing data, and avoid later additions and renamings of netCDF components that
require more space in the header part of the file, you avoid the cost associated with later
changing the header.
Alternatively, you can use a variant of the enddef function, spelled with two underbar
characters instead of one, to explicitly reserve extra space in the file header when the file is
created: in C nc__enddef (see Section “nc__enddef” in The NetCDF C Interface Guide), in
Fortran NF__ENDDEF (see Section “NF__ENDDEF” in The NetCDF Fortran 77 Interface
Guide), after a previous call to the redef function. This avoids the expense of moving all
the data later by reserving enough extra space in the header to accommodate anticipated
changes, such as the addition of new attributes or the extension of existing string attributes
to hold longer strings.
When the size of the header is changed, data in the file is moved, and the location of
data values in the file changes. If another program is reading the netCDF dataset during
redefinition, its view of the file will be based on old, probably incorrect indexes. If netCDF
datasets are shared across redefinition, some mechanism external to the netCDF library
must be provided that prevents access by readers during redefinition, and causes the readers
to call nc_sync/NF_SYNC before any subsequent access.
The fixed-size data part that follows the header contains all the variable data for variables
that do not employ an unlimited dimension. The data for each variable is stored
contiguously in this part of the file. If there is no unlimited dimension, this is the last part
of the netCDF file.
The record-data part that follows the fixed-size data consists of a variable number of
fixed-size records, each of which contains data for all the record variables. The record data
for each variable is stored contiguously in each record.
The order in which the variable data appears in each data section is the same as the
order in which the variables were defined, in increasing numerical order by netCDF variable
ID. This knowledge can sometimes be used to enhance data access performance, since the
best data access is currently achieved by reading or writing the data in sequential order.
For more detail see Appendix C [File Format], page 65.
netcdf bigfile1 {
dimensions:
x=2000;
y=5000;
z=10000;
variables:
double x(x); // coordinate variables
double y(y);
double z(z);
double var(x, y, z); // 800 Gbytes
}
If you use the unlimited dimension, record variables may exceed 2 GiB in size, as long
as the offset of the start of each record variable within a record is less than 2 GiB - 4. For
example, the structure of the data in a 2.4 Tbyte file might be something like:
netcdf bigfile2 {
dimensions:
x=2000;
y=5000;
z=10;
t=UNLIMITED; // 1000 records, for example
variables:
double x(x); // coordinate variables
double y(y);
double z(z);
double t(t);
// 3 record variables, 2400000000 bytes per record
double var1(t, x, y, z);
double var2(t, x, y, z);
double var3(t, x, y, z);
}
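The sizes quoted in the two CDL examples can be verified with a little arithmetic. The
Python sketch below assumes that record variables are laid out within each record in
definition order; record_layout is a hypothetical helper, and the 8-byte coordinate
variable t is included in the record, so the record is 8 bytes larger than the three data
variables alone.

```python
DOUBLE_BYTES = 8
OFFSET_LIMIT = 2**31 - 4   # "2 GiB - 4" from the text


def record_layout(sizes):
    """Return (offsets, record_size) for per-record sizes in definition order."""
    offsets, position = [], 0
    for size in sizes:
        offsets.append(position)
        position += size
    return offsets, position


# bigfile1: a single fixed-size variable var(x, y, z).
print(2000 * 5000 * 10000 * DOUBLE_BYTES)    # 800000000000 bytes (800 Gbytes)

# bigfile2: record variables t, var1, var2, var3.
per_var = 2000 * 5000 * 10 * DOUBLE_BYTES    # bytes of var1 in one record
offsets, record_size = record_layout([DOUBLE_BYTES, per_var, per_var, per_var])

print(3 * per_var)                              # 2400000000 bytes per record
print(offsets)                                  # [0, 8, 800000008, 1600000008]
print(all(o < OFFSET_LIMIT for o in offsets))   # True
print(record_size * 1000)                       # 2400000008000 bytes, about 2.4 Tbytes
```

Every record variable starts well below the offset limit, even though each record as a
whole is larger than 2 GiB.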
changes up-to-date (for example, changes to attribute values). Opening the file with the
NC_SHARE flag (in C) or the NF_SHARE flag (in Fortran) is analogous to setting a stdio
stream to be unbuffered with the _IONBF flag to setvbuf.
As in the stdio library, flushes are also performed when "seeks" occur to a different area
of the file. Hence the order of read and write operations can influence I/O performance
significantly. Reading data in the same order in which it was written within each record
will minimize buffer flushes.
You should not expect netCDF classic or 64-bit offset format data access to work with
multiple writers having the same file open for writing simultaneously.
It is possible to tune an implementation of netCDF for some platforms by replacing
the I/O layer with a different platform-specific I/O layer. This may change the similarities
between netCDF and standard I/O, and hence characteristics related to data sharing,
buffering, and the cost of I/O operations.
The distributed netCDF implementation is meant to be portable. Platform-specific ports
that further optimize the implementation for better I/O performance are practical in some
cases.
1024-block buffers on the SSD. This scheme works well when accesses proceed
through the dataset in random waves roughly 2x1024-blocks wide.
All of the options/configurations supported in CRI’s FFIO library are available through
this mechanism. We recommend that you look at CRI’s I/O optimization guide for information
on using FFIO to its fullest. This mechanism is also compatible with CRI’s EIE
I/O library.
Tuning the NETCDF_FFIOSPEC variable to a program’s I/O pattern can dramatically
improve performance. Speedups of two orders of magnitude have been seen.
Dimension scales are a new feature in HDF5 1.8 that allows the specification of shared
dimensions.
(In the future netCDF-4 will be able to deal with HDF5 files which do not have dimension
scales. However, this is not expected before netCDF 4.1.)
Finally, there is one feature which is missing from all current HDF5 releases, but which
will be in 1.8 - the ability to track object creation order. As you may know, netCDF keeps
track of the creation order of variables, dimensions, etc. HDF5 (currently) does not.
There is a bit of a hack in place in netCDF-4 files for this, but that hack will go away
when HDF5 1.8 comes out.
Without creation order, the files will still be readable to netCDF-4; it’s just that netCDF-4
will number the variables in alphabetical, rather than creation, order.
Interoperability is a complex task, and all of this is in the alpha release stage. It is tested
in libsrc4/tst_interops.c, which contains some examples of how to create HDF5 files, modify
them in netCDF-4, and then verify them in HDF5 (and vice versa).
It is possible to see what the translation does to a particular DAP data source in either of
two ways. First, one can examine the DDS source through a web browser and then examine
the translation using the ncdump -h command to see the netCDF Classic translation. The
ncdump output will actually be the union of the DDS with the DAS, so to see the complete
translation, it is necessary to view both.
For example, if a web browser is given the following, the first URL will return the DDS
for the specified dataset, and the second URL will return the DAS for the specified dataset.
http://test.opendap.org:8080/dods/dts/test.01.dds
http://test.opendap.org:8080/dods/dts/test.01.das
Then by using the following ncdump command, it is possible to see the equivalent netCDF
Classic translation.
ncdump -h http://test.opendap.org:8080/dods/dts/test.01
The DDS output from the web server should look like this.
Dataset {
Byte b;
Int32 i32;
UInt32 ui32;
Int16 i16;
UInt16 ui16;
Float32 f32;
Float64 f64;
String s;
Url u;
} SimpleTypes;
The DAS output from the web server should look like this.
Attributes {
Facility {
String PrincipleInvestigator "Mark Abbott", "Ph.D";
String DataCenter "COAS Environmental Computer Facility";
String DrifterType "MetOcean WOCE/OCM";
}
b {
String Description "A test byte";
String units "unknown";
}
i32 {
String Description "A 32 bit test server int";
String units "unknown";
}
}
The output from ncdump should look like this.
netcdf test {
dimensions:
unlimited = UNLIMITED ; // (0 currently)
stringdim64 = 64 ;
variables:
byte b ;
b:Description = "A test byte" ;
b:units = "unknown" ;
int i32 ;
i32:Description = "A 32 bit test server int" ;
i32:units = "unknown" ;
int ui32 ;
short i16 ;
short ui16 ;
float f32 ;
double f64 ;
char s(stringdim64) ;
char u(stringdim64) ;
}
Note that the fields of type String and type URL now have a dimension. This
is because strings are translated to arrays of char, which requires adding an extra dimension.
The size of the dimension is determined in a variety of ways and can be specified. It defaults
to 64 and when read, the underlying string is either padded or truncated to that length.
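The pad-or-truncate step can be sketched as follows. The function name to_char_array is
hypothetical, and padding with NUL characters is an assumption of this sketch; the text
above does not say which padding character the translation uses.

```python
def to_char_array(s, length=64, pad="\0"):
    """Fit a string into a fixed-length char array by truncating or padding."""
    data = s[:length]                          # truncate if too long
    return data + pad * (length - len(data))   # pad if too short


short = to_char_array("unknown")
long_ = to_char_array("x" * 100)
print(len(short), len(long_))   # 64 64
print(short.rstrip("\0"))       # unknown
```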
Also note that the Facility attributes do not appear in the translation because they
are neither global nor associated with a variable in the DDS.
Alternately, one can get the DDS as a global attribute by using the client parameters
mechanism. In this case, the parameter “[show=dds]” can be prefixed to the URL and the
data retrieved using the following command
ncdump -h [show=dds]http://test.opendap.org:8080/dods/dts/test.01.dds
The ncdump -h command will then show both the translation and the original DDS. In
the above example, the DDS would appear as the global attribute “_DDS” as follows.
netcdf test {
...
variables:
:_DDS = "Dataset { Byte b; Int32 i32; UInt32 ui32; Int16 i16;
UInt16 ui16; Float32 f32; Float64 f64;
String s; Url u; } SimpleTypes;"
byte b ;
...
}
2. For each unique anonymous dimension with value NN create a netCDF dimension of
the form “DIMNN=NN”.
3. For each unique named dimension “<name>=NN”, create a netCDF dimension of the
form “<name>=NN”.
4. At this point the only dimensions left to process should be named dimensions with the
same name as some dimension from step number 3, but with a different value. For
those dimensions create a dimension of the form “<name>NN=NN”.
5. Define a single UNLIMITED dimension.
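Rules 2 through 4 above can be sketched in Python. This is one reading of the rules, not
the library's implementation: translate_dimensions is a hypothetical name, a dimension
is modelled as a (name, size) pair with None for an anonymous dimension, and when two
named dimensions share a name the first one encountered is assumed to keep the plain name.

```python
def translate_dimensions(dims):
    """Map DAP dimensions to netCDF dimension names, following rules 2 to 4.

    dims -- iterable of (name, size); name is None for an anonymous dimension
    """
    dims = list(dims)
    result = {}
    # Rule 2: each unique anonymous size NN becomes DIMNN=NN.
    for name, size in dims:
        if name is None:
            result.setdefault(f"DIM{size}", size)
    # Rule 3: each named dimension keeps its name (first occurrence wins here).
    for name, size in dims:
        if name is not None and name not in result:
            result[name] = size
    # Rule 4: same name but a different size becomes <name>NN=NN.
    for name, size in dims:
        if name is not None and result[name] != size:
            result.setdefault(f"{name}{size}", size)
    return result


print(translate_dimensions([("lat", 2), ("long", 2), (None, 3), (None, 2)]))
# {'DIM3': 3, 'DIM2': 2, 'lat': 2, 'long': 2}
print(translate_dimensions([("lat", 2), ("lat", 4)]))
# {'lat': 2, 'lat4': 4}
```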
Variable translation is a bit more complicated. Consider this OPeNDAP DDS.
Dataset {
Int32 f1;
Structure {
Int32 f11;
Structure {
Int32 f1[3];
Int32 f2;
} FS2[2];
} S1;
Structure {
Grid {
Array:
Float32 temp[lat=2][long=2];
Maps:
Int32 lat[lat=2];
Int32 long[long=2];
} G1;
} S2;
} D1;
Step 1: collect each primitive typed field in the DDS and rename to the qualified path
name (ignoring the Dataset name). Thus for the above DDS, the following fields would be
collected.
1. f1
2. S1.f11
3. S1.FS2.f1
4. S1.FS2.f2
5. S2.G1.temp
6. S2.G1.lat
7. S2.G1.long
Step 2: repeatedly remove the rightmost element of the qualified name except for the
final base name. This is called hoisting. If the resulting name conflicts with another name,
then un-hoist and do not process that name further. As an additional constraint, if one
name of a compound type (Structure, Grid, Sequence) conflicts, then make no attempt to
hoist any other field in that type.
Applying this rule leads to the following sequences of hoisting.
1. Step 1.
   1. f1 - no hoisting is possible
   2. hoist S1.f11 to f11
   3. hoist S1.FS2.f1 to S1.f1
   4. hoist S1.FS2.f2 to S1.f2
   5. hoist S2.G1.temp to S2.temp
   6. hoist S2.G1.lat to S2.lat
   7. hoist S2.G1.long to S2.long
2. Step 2.
   1. f1 - no hoisting is possible
   2. f11 - no further hoisting is possible
   3. hoist S1.f1 to f1 - this causes a conflict with the existing f1 field, so S1.f1 is kept
   4. S1.f2 - do not hoist because its sibling S1.f1 could not be hoisted
   5. hoist S2.temp to temp
   6. hoist S2.lat to lat
   7. hoist S2.long to long
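The hoisting passes listed above can be reproduced with a short Python sketch. It is one
possible interpretation of the rule, written to match this worked example rather than
taken from the library: the function name hoist is hypothetical, names are plain dotted
strings, and a conflict in one field is assumed to freeze every field that shares the same
parent path.

```python
def hoist(names):
    """Shorten dotted names pass by pass, keeping any name whose hoist conflicts."""
    current = [n.split(".") for n in names]
    frozen = [False] * len(current)
    while True:
        # Propose dropping the element just left of the base name.
        proposals = {
            i: parts[:-2] + parts[-1:]
            for i, parts in enumerate(current)
            if len(parts) > 1 and not frozen[i]
        }
        if not proposals:
            break
        # A proposal conflicts if it equals another current or proposed name.
        blocked_parents = set()
        for i, new in proposals.items():
            others = [p for j, p in enumerate(current) if j != i]
            others += [p for j, p in proposals.items() if j != i]
            if new in others:
                blocked_parents.add(tuple(current[i][:-1]))
        for i, new in proposals.items():
            if tuple(current[i][:-1]) in blocked_parents:
                frozen[i] = True    # un-hoist: keep the current name for good
            else:
                current[i] = new
    return [".".join(parts) for parts in current]


fields = ["f1", "S1.f11", "S1.FS2.f1", "S1.FS2.f2",
          "S2.G1.temp", "S2.G1.lat", "S2.G1.long"]
print(hoist(fields))
# ['f1', 'f11', 'S1.f1', 'S1.f2', 'temp', 'lat', 'long']
```

The result agrees with the variable names that appear in the final schema for this DDS.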
The final netCDF-3 schema is then as follows.
netcdf D1 {
dimensions:
lat=2; long=2; DIM3=3; DIM2=2;
variables:
int f1 ;
int f11 ;
int S1.f1(DIM2,DIM3) ;
int S1.f2(DIM2) ;
float temp(lat,long) ;
int lat(lat) ;
int long(long) ;
}
The dimensions were created using the rules defined above. The variables are dimensioned
by the concatenation of their dimensions with the dimensions associated with any
containing Structure. Thus, since FS2 has dimension 2, the variable S1.f1 includes DIM2
along with its own DIM3 and S1.f2 includes DIM2 from its parent structure.
[unlimitedsequence]
If the underlying DDS has a certain form, and this client parameter is specified,
then it is possible to access sequence data that would otherwise be unavailable.
It is important to note that setting this flag has performance consequences
because the Sequence data must be scanned to count the actual number of
records. This record count is recorded as the size of the UNLIMITED dimension
and the variables (i.e. Sequence fields) all will have UNLIMITED as their first
dimension. The requirement that the DDS must meet is this: there must be
exactly one Sequence type defined and it must be at the “top level” (i.e. it
must be a direct field of the Dataset).
5 NetCDF Utilities
One of the primary reasons for using the netCDF interface for applications that deal with
arrays is to take advantage of higher-level netCDF utilities and generic applications for
netCDF data. Currently two netCDF utilities are available as part of the netCDF software
distribution:
ncdump reads a netCDF dataset and prints a textual representation of the information
in the dataset
ncgen/ncgen4
reads a textual representation of a netCDF dataset and generates the cor-
responding binary netCDF file or a C or FORTRAN program to create the
netCDF dataset
Users have contributed other netCDF utilities, and various visualization and analysis
packages are available that access netCDF data. For an up-to-date list of freely-available
and commercial software that can access or manipulate netCDF data, see the NetCDF
Software list, http://www.unidata.ucar.edu/netcdfsoftware.html.
This chapter describes the ncgen, ncgen4, and ncdump utilities. These three tools convert
between binary netCDF datasets and a text representation of netCDF datasets. The output
of ncdump and the input to ncgen is a text description of a netCDF dataset in a tiny
language known as CDL (network Common data form Description Language).
dimensions:
lat = 10, lon = 5, time = unlimited;
variables:
int lat(lat), lon(lon), time(time);
float z(time,lat,lon), t(time,lat,lon);
double p(time,lat,lon);
int rh(time,lat,lon);
lat:units = "degrees_north";
lon:units = "degrees_east";
time:units = "seconds";
z:units = "meters";
z:valid_range = 0., 5000.;
p:_FillValue = -9999.;
rh:_FillValue = -1;
data:
lat = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
lon = -140, -118, -96, -84, -52;
}
All CDL statements are terminated by a semicolon. Spaces, tabs, and newlines can be
used freely for readability. Comments may follow the double slash characters ’//’ on any
line.
A CDL description for a classic model file consists of three optional parts: dimensions,
variables, and data. The variable part may contain variable declarations and attribute
assignments. For the enhanced model supported by netCDF-4, a CDL description may also
include groups, subgroups, and user-defined types.
A dimension is used to define the shape of one or more of the multidimensional variables
described by the CDL description. A dimension has a name and a length. At most one
dimension in a classic CDL description can have the unlimited length, which means a
variable using this dimension can grow to any length (like a record number in a file). Any
number of dimensions can be declared of unlimited length in CDL for an enhanced model
file.
A variable represents a multidimensional array of values of the same type. A variable
has a name, a data type, and a shape described by its list of dimensions. Each variable
may also have associated attributes (see below) as well as data values. The name, data
type, and shape of a variable are specified by its declaration in the variable section of a
CDL description. A variable may have the same name as a dimension; by convention such
a variable contains coordinates of the dimension it names.
An attribute contains information about a variable or about the whole netCDF dataset or
containing group. Attributes may be used to specify such properties as units, special values,
maximum and minimum valid values, and packing parameters. Attribute information is
represented by single values or one-dimensional arrays of values. For example, “units” might
be an attribute represented by a string such as “celsius”. An attribute has an associated
variable, a name, a data type, a length, and a value. In contrast to variables that are
intended for data, attributes are intended for ancillary data or metadata (data about data).
In CDL, an attribute is designated by a variable and attribute name, separated by a
colon (’:’). It is possible to assign global attributes to the netCDF dataset as a whole
by omitting the variable name and beginning the attribute name with a colon (’:’). The
data type of an attribute in CDL, if not explicitly specified, is derived from the type of
the value assigned to it. The length of an attribute is the number of data values or the
number of characters in the character string assigned to it. Multiple values are assigned to
non-character attributes by separating the values with commas (’,’). All values assigned to
an attribute must be of the same type. In the netCDF-4 enhanced model, attributes may
be declared to be of user-defined type, like variables.
In CDL, just as for netCDF, the names of dimensions, variables and attributes (and,
in netCDF-4 files, groups, user-defined types, compound member names, and enumeration
symbols) consist of arbitrary sequences of alphanumeric characters, underscore ’_’, period
’.’, plus ’+’, hyphen ’-’, or at sign ’@’, but beginning with a letter or underscore. However,
names commencing with underscore are reserved for system use. Case is significant in
netCDF names. A zero-length name is not allowed. Some widely used conventions restrict
names to only alphanumeric characters or underscores. Names that have trailing space
characters are also not permitted.
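The basic naming rule above can be sketched as a small check. This is only an illustration, not code from the netCDF library: the function name check_basic_name is hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the extended UTF-8 and escaped-character rules described next are deliberately not covered.

```python
import re

# Basic name rule from the text: start with a letter or underscore, then any
# mix of alphanumerics and the characters _ . + - @ (ASCII only, by assumption).
_BASIC_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.+\-@]*")


def check_basic_name(name):
    """Return 'ok', 'reserved', or 'invalid' for a candidate netCDF name."""
    if not _BASIC_NAME.fullmatch(name):
        # Covers zero-length names, a bad first character, illegal characters,
        # and trailing spaces (a space is not in the allowed set).
        return "invalid"
    if name.startswith("_"):
        # Names commencing with underscore are reserved for system use.
        return "reserved"
    return "ok"


for candidate in ["temp", "S1.f1", "_FillValue", "2m_temp", "lat ", ""]:
    print(repr(candidate), check_basic_name(candidate))
```

A stricter check, such as one for conventions that allow only alphanumeric characters and underscores, would simply use a narrower character class.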
Beginning with versions 3.6.3 and 4.0, names may also include UTF-8 encoded Unicode
characters as well as other special characters, except for the character ’/’, which may not
appear in a name (because it is reserved for path names of nested groups). In CDL, most
special characters are escaped with a backslash ’\’ character, but that character is not
actually part of the netCDF name. The special characters that do not need to be escaped
in CDL names are underscore ’_’, period ’.’, plus ’+’, hyphen ’-’, or at sign ’@’. For the
formal specification of CDL name syntax, see Section 1.3 [Format], page 6. Note that by
using special characters in names, you may make your data not compliant with conventions
that have more stringent requirements on valid names for netCDF components, for example
the CF Conventions.
The names for the primitive data types are reserved words in CDL, so names of variables,
dimensions, and attributes must not be primitive type names.
The optional data section of a CDL description is where netCDF variables may be
initialized. The syntax of an initialization is simple:
variable = value_1, value_2, ...;
The comma-delimited list of constants may be separated by spaces, tabs, and newlines.
For multidimensional arrays, the last dimension varies fastest. Thus, row-major order rather
than column-major order is used for matrices. If fewer values are supplied than are needed to fill a
variable, it is extended with the fill value. The types of constants need not match the type
declared for a variable; coercions are done to convert integers to floating point, for example.
All meaningful type conversions among primitive types are supported.
A special notation for fill values is supported: the ‘_’ character designates a fill value for
variables.
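The ordering and fill rules for the data section can be illustrated with a short sketch. It is not how ncgen is implemented: the function name layout is hypothetical, the fill value is passed in explicitly, type coercion is ignored, and only fixed-size shapes are handled (no unlimited dimension).

```python
def layout(values, shape, fill):
    """Arrange a flat CDL value list into nested lists, last dimension fastest."""
    total = 1
    for n in shape:
        total *= n
    if len(values) > total:
        raise ValueError("more values than the variable can hold")
    # The '_' notation in a data list designates the fill value.
    flat = [fill if v == "_" else v for v in values]
    # If fewer values are supplied than needed, extend with the fill value.
    flat += [fill] * (total - len(flat))
    # Group from the innermost (fastest-varying) dimension outward.
    for n in reversed(shape[1:]):
        flat = [flat[i:i + n] for i in range(0, len(flat), n)]
    return flat


# A 2 x 3 variable given only four values, one of them written as '_':
print(layout([1, 2, "_", 4], (2, 3), fill=-9999))
# prints [[1, 2, -9999], [4, -9999, -9999]]
```

The first row is filled completely before the second row starts, which is what "the last dimension varies fastest" means for a matrix.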
5.4 ncgen
The ncgen tool generates a netCDF file or a C or FORTRAN program that creates a netCDF
dataset. If no options are specified in invoking ncgen, the program merely checks the syntax
of the CDL input, producing error messages for any violations of CDL syntax.
In the current release, the ncgen utility can only generate classic-model netCDF-4 files or
programs. A separate experimental utility, ncgen4 (built when --enable-ncgen4 is specified at
build time), can be used to generate netCDF-4 files or C code for the enhanced data model.
The ncgen4 utility also handles the special virtual attributes produced by the ncdump ’-s’
option to describe implementation characteristics such as compression and chunking. It
does not yet generate Fortran code.
-b Create a (binary) netCDF file. If the ’-o’ option is absent, a default file name
will be constructed from the netCDF name (specified after the netcdf keyword
in the input) by appending the ’.nc’ extension. Warning: if a file already exists
with the specified name it will be overwritten.
-o netcdf-file
Name for the netCDF file created. If this option is specified, it implies the ’-b’
option. (This option is necessary because netCDF files are direct-access files
created with seek calls, and hence cannot be written to standard output.)
-c Generate C source code that will create a netCDF dataset matching the netCDF
specification. The C source code is written to standard output. This is only
useful for relatively small CDL files, since all the data is included in variable
initializations in the generated program.
-f Generate FORTRAN source code that will create a netCDF dataset matching
the netCDF specification. The FORTRAN source code is written to standard
output. This is only useful for relatively small CDL files, since all the data is
included in variable initializations in the generated program.
-v2 The generated netCDF file or program will use the version of the format with
64-bit offsets, to allow for the creation of very large files. These files are not as
portable as classic format netCDF files, because they require version 3.6.0 or
later of the netCDF library.
-v3 The generated netCDF file will be in netCDF-4/HDF5 format. These files are
not as portable as classic format netCDF files, because they require version 4.0
or later of the netCDF library.
-x Use “no fill” mode, omitting the initialization of variable values with fill values.
This can make the creation of large files much faster, but it will also eliminate
the possibility of detecting the inadvertent reading of values that haven’t been
written.
Examples
Check the syntax of the CDL file foo.cdl:
ncgen foo.cdl
From the CDL file foo.cdl, generate an equivalent binary netCDF file named bar.nc:
ncgen -o bar.nc foo.cdl
From the CDL file foo.cdl, generate a C program containing netCDF function invocations
that will create an equivalent binary netCDF dataset:
ncgen -c foo.cdl > foo.c
5.5 ncdump
The ncdump tool generates the CDL text representation of a netCDF dataset on standard
output, optionally excluding some or all of the variable data in the output. The output from
ncdump is intended to be acceptable as input to ncgen. Thus ncdump and ncgen can be
used as inverses to transform data representation between binary and text representations.
As of NetCDF version 4.1, ncdump can also access DAP data sources if DAP support is
enabled in the underlying NetCDF library. Instead of specifying a file name as argument
to ncdump, the user specifies a URL to a DAP source.
ncdump may also be used as a simple browser for netCDF datasets, to display the
dimension names and lengths; variable names, types, and shapes; attribute names and
values; and optionally, the values of data for all variables or selected variables in a netCDF
dataset.
ncdump defines a default format used for each type of netCDF variable data, but this
can be overridden if a C_format attribute is defined for a netCDF variable. In this case,
ncdump will use the C_format attribute to format values for that variable. For example,
if floating-point data for the netCDF variable Z is known to be accurate to only three
significant digits, it might be appropriate to use this variable attribute:
Z:C_format = "%.3g"
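To see what a format such as "%.3g" does to typical values, the following sketch uses Python's % operator, which follows the C printf conventions for this conversion. It only illustrates the format string; the helper name format_values is hypothetical and this is not ncdump's own formatting code.

```python
def format_values(values, c_format="%.3g"):
    """Format each value with a printf-style format string."""
    return ", ".join(c_format % v for v in values)


print(format_values([3.14159, 1234.5678, 0.00012345]))
# prints 3.14, 1.23e+03, 0.000123
```

Each value keeps three significant digits, and the conversion switches to exponent notation when the magnitude calls for it.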
ncdump uses ’_’ to represent data values that are equal to the _FillValue attribute for
a variable, intended to represent data that has not yet been written. If a variable has no
_FillValue attribute, the default fill value for the variable type is used unless the variable is
of byte type.
UNIX syntax for invoking ncdump:
ncdump [ -c | -h] [-v var1,...] [-b lang] [-f lang]
[-l len] [ -p fdig[,ddig]] [ -s ] [ -n name] [input-file]
where:
-c Show the values of coordinate variables (variables that are also dimensions) as
well as the declarations of all dimensions, variables, and attribute values. Data
values of non-coordinate variables are not included in the output. This is often
the most suitable option to use for a brief look at the structure and contents of
a netCDF dataset.
-h Show only the header information in the output, that is, output only the dec-
larations for the netCDF dimensions, variables, and attributes of the input file,
but no data values for any variables. The output is identical to using the ’-c’
option except that the values of coordinate variables are not included. (At most
one of ’-c’ or ’-h’ options may be present.)
-v var1,...
The output will include data values for the specified variables, in addition to the
declarations of all dimensions, variables, and attributes. One or more variables
must be specified by name in the comma-delimited list following this option.
The list must be a single argument to the command, hence cannot contain
blanks or other white space characters. The named variables must be valid
netCDF variables in the input-file. The default, without this option and in the
absence of the ’-c’ or ’-h’ options, is to include data values for all variables in
the output.
-b lang A brief annotation in the form of a CDL comment (text beginning with the
characters ’//’) will be included in the data section of the output for each
’row’ of data, to help identify data values for multidimensional variables. If
lang begins with ’C’ or ’c’, then C language conventions will be used (zero-
based indices, last dimension varying fastest). If lang begins with ’F’ or ’f’,
then FORTRAN language conventions will be used (one-based indices, first
dimension varying fastest). In either case, the data will be presented in the
same order; only the annotations will differ. This option may be useful for
browsing through large volumes of multidimensional data.
-f lang Full annotations in the form of trailing CDL comments (text beginning with the
characters ’//’) for every data value (except individual characters in character
arrays) will be included in the data section. If lang begins with ’C’ or ’c’, then
C language conventions will be used (zero-based indices, last dimension varying
fastest). If lang begins with ’F’ or ’f’, then FORTRAN language conventions
will be used (one-based indices, first dimension varying fastest). In either case,
the data will be presented in the same order; only the annotations will differ.
This option may be useful for piping data into other filters, since each data
value appears on a separate line, fully identified. (At most one of ’-b’ or ’-f’
options may be present.)
-l len Changes the default maximum line length (80) used in formatting lists of non-
character data values.
-p float_digits[,double_digits]
Specifies default precision (number of significant digits) to use in displaying
floating-point or double precision data values for attributes and variables. If
specified, this value overrides the value of the C_format attribute, if any, for
a variable. Floating-point data will be displayed with float_digits significant
digits. If double_digits is also specified, double-precision values will be displayed
with that many significant digits. In the absence of any ’-p’ specifications,
floating-point and double-precision data are displayed with 7 and 15 significant
digits respectively. CDL files can be made smaller if less precision is required.
If both floating-point and double precisions are specified, the two values must
appear separated by a comma (no blanks) as a single argument to the command.
-n name CDL requires a name for a netCDF dataset, for use by ’ncgen -b’ in generating
a default netCDF dataset name. By default, ncdump constructs this name from
the last component of the file name of the input netCDF dataset by stripping off
any extension it has. Use the ’-n’ option to specify a different name. Although
the output file name used by ’ncgen -b’ can be specified, it may be wise to have
ncdump change the default name to avoid inadvertently overwriting a valuable
netCDF dataset when using ncdump, editing the resulting CDL file, and using
’ncgen -b’ to generate a new netCDF dataset from the edited CDL file.
-s Specifies that special virtual attributes should be output for the file format
variant and for variable properties such as compression, chunking, and other
Examples
Look at the structure of the data in the netCDF dataset foo.nc:
ncdump -c foo.nc
Produce an annotated CDL version of the structure and data in the netCDF dataset
foo.nc, using C-style indexing for the annotations:
ncdump -b c foo.nc > foo.cdl
Output data for only the variables uwind and vwind from the netCDF dataset foo.nc,
and show the floating-point data with only three significant digits of precision:
ncdump -v uwind,vwind -p 3 foo.nc
Produce a fully-annotated (one data value per line) listing of the data for the variable
omega, using FORTRAN conventions for indices, and changing the netCDF dataset name
in the resulting CDL file to omega:
ncdump -v omega -f fortran -n omega foo.nc > Z.cdl
Examine the translated DDS for the DAP source from the specified URL.
ncdump -h http://test.opendap.org:8080/dods/dts/test.01
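When ncdump is driven from a script, the option constraints described above (at most one of ’-c’ and ’-h’, a ’-v’ list with no white space, a single comma-separated ’-p’ argument) can be checked before the command is run. The sketch below only builds the argument list; the function name ncdump_args and its parameter names are hypothetical, and it covers only a subset of the options.

```python
def ncdump_args(input_file, coords_only=False, header_only=False,
                variables=None, float_digits=None, double_digits=None,
                name=None):
    """Build an ncdump command line as a list of arguments."""
    if coords_only and header_only:
        raise ValueError("at most one of -c and -h may be present")
    if double_digits is not None and float_digits is None:
        raise ValueError("double_digits requires float_digits")
    args = ["ncdump"]
    if coords_only:
        args.append("-c")
    if header_only:
        args.append("-h")
    if variables:
        # The variable list must be a single argument without white space.
        if any(ch.isspace() for v in variables for ch in v):
            raise ValueError("variable names in -v must not contain white space")
        args += ["-v", ",".join(variables)]
    if float_digits is not None:
        precision = str(float_digits)
        if double_digits is not None:
            precision += "," + str(double_digits)   # comma, no blanks
        args += ["-p", precision]
    if name is not None:
        args += ["-n", name]
    args.append(input_file)
    return args


print(" ".join(ncdump_args("foo.nc", variables=["uwind", "vwind"], float_digits=3)))
# prints ncdump -v uwind,vwind -p 3 foo.nc
```

On a system where ncdump is installed, the returned list could be passed to subprocess.run to execute the command.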
5.6 ncgen4
The ncgen4 tool is an experimental version of ncgen that is capable of producing netCDF-4
files. It operates essentially identically to ncgen, except that generation of FORTRAN code
is not supported.
The CDL input to ncgen4 may include data model constructs from the netCDF-4 data
model. In particular, it includes new primitive types such as unsigned integers and strings,
opaque data, enumerations, and user-defined constructs using vlen and compound types.
The ncgen4 man page should be consulted for more detailed information.
Appendix A Units
The Unidata Program Center has developed a units library to convert between formatted
and binary forms of units specifications and perform unit algebra on the binary form.
Though the units library is self-contained and there is no dependency between it and
the netCDF library, it is nevertheless useful in writing generic netCDF programs and
we suggest you obtain it. The library and associated documentation are available from
http://www.unidata.ucar.edu/packages/udunits/.
The following are examples of units strings that can be interpreted by the utScan()
function of the Unidata units library:
10 kilogram.meters/seconds2
10 kg-m/sec2
10 kg m/s^2
10 kilogram meter second-2
(PI radian)2
degF
100rpm
geopotential meters
33 feet water
milliseconds since 1992-12-31 12:34:0.1 -7:00
A unit is specified as an arbitrary product of constants and unit-names raised to arbitrary
integral powers. Division is indicated by a slash ’/’. Multiplication is indicated by white
space, a period ’.’, or a hyphen ’-’. Exponentiation is indicated by an integer suffix or
by the exponentiation operators ’^’ and ’**’. Parentheses may be used for grouping and
disambiguation. The time stamp in the last example is handled as a special case.
Arbitrary Galilean transformations (i.e., y = ax + b) are allowed. In particular, temper-
ature conversions are correctly handled. The specification:
degF @ 32
indicates a Fahrenheit scale with the origin shifted to thirty-two degrees Fahrenheit (i.e.,
to zero Celsius). Thus, the Celsius scale is equivalent to the following unit:
1.8 degF @ 32
Note that the origin-shift operation takes precedence over multiplication. In order of
increasing precedence, the operations are division, multiplication, origin-shift, and expo-
nentiation.
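The y = ax + b form can be made concrete with a small numeric sketch. It does not call the units library; the helper name affine_unit is hypothetical, and the constants are the ones that appear in this appendix (a Celsius degree as 1.8 Fahrenheit degrees with the origin at 32 degrees Fahrenheit, and the kelvin encoding of degF printed by utPrint()).

```python
def affine_unit(scale, origin):
    """Return a converter for a unit defined as a scaled, origin-shifted base unit.

    A value x in the derived unit corresponds to scale * x + origin in the
    base unit, which is the y = ax + b form.
    """
    def to_base(x):
        return scale * x + origin
    return to_base


# Celsius expressed as 1.8 degF with the origin shifted to 32 degF.
celsius_to_degF = affine_unit(1.8, 32.0)
# degF expressed as 0.555556 kelvin with the origin at 255.372 kelvin.
degF_to_kelvin = affine_unit(0.555556, 255.372)

print(celsius_to_degF(0.0), celsius_to_degF(100.0))
print(round(degF_to_kelvin(32.0), 2))
```

Because the origin shift is applied to the unit before the multiplication by 1.8, zero in the derived unit lands exactly on the shifted origin.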
utScan() understands all the SI prefixes (e.g. "mega" and "milli") plus their
abbreviations (e.g. "M" and "m").
The function utPrint() always encodes a unit specification one way. To reduce misunder-
standings, it is recommended that this encoding style be used as the default. In general, a
unit is encoded in terms of basic units, factors, and exponents. Basic units are separated by
spaces, and any exponent directly appends its associated unit. The above examples would
be encoded as follows:
10 kilogram meter second-2
9.8696044 radian2
0.555556 kelvin @ 255.372
For additional information on this units library, please consult the manual pages that
come with its distribution.
Appendix B: Attribute Conventions 61
add_offset
If present for a variable, this number is to be added to the data after it is read
by the application that accesses the data. If both scale_factor and add_offset
attributes are present, the data are first scaled before the offset is added. The
attributes scale_factor and add_offset can be used together to provide simple
data compression to store low-resolution floating-point data as small integers in
a netCDF dataset. When scaled data are written, the application should first
subtract the offset and then divide by the scale factor, rounding the result to
the nearest integer to avoid a bias caused by truncation towards zero.
When scale_factor and add_offset are used for packing, the associated variable
(containing the packed data) is typically of type byte or short, whereas the
unpacked values are intended to be of type float or double. The attributes
scale_factor and add_offset should both be of the type intended for the unpacked
data, e.g. float or double.
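The packing arithmetic described for scale_factor and add_offset can be sketched as follows. This is an illustration of the convention only, not library code: the function names pack and unpack are hypothetical, and no check is made that the packed result fits in a byte or short.

```python
def pack(value, scale_factor, add_offset):
    """Subtract the offset, divide by the scale factor, round to nearest integer."""
    # round() avoids the bias that truncation towards zero would introduce.
    return int(round((value - add_offset) / scale_factor))


def unpack(packed, scale_factor, add_offset):
    """Scale first, then add the offset, as done after reading."""
    return packed * scale_factor + add_offset


scale_factor, add_offset = 0.01, 273.15
packed = pack(287.66, scale_factor, add_offset)
print(packed, unpack(packed, scale_factor, add_offset))

# Rounding versus truncation for a value slightly below the offset:
print(pack(273.143, scale_factor, add_offset), int((273.143 - add_offset) / scale_factor))
```

In the second print, rounding yields -1 while plain truncation yields 0, which is the bias the text warns about. A round trip through pack and unpack recovers the original value to within half of scale_factor.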
_FillValue
The _FillValue attribute specifies the fill value used to pre-fill disk space
allocated to the variable. Such pre-fill occurs unless no-fill mode is set using
nc_set_fill in C (see Section “nc_set_fill” in The NetCDF C Interface Guide) or
NF_SET_FILL in Fortran (see Section “NF_SET_FILL” in The NetCDF Fortran 77
Interface Guide). The fill value is returned when reading values that
were never written. If _FillValue is defined then it should be scalar and of the
same type as the variable. It is not necessary to define your own _FillValue
attribute for a variable if the default fill value for the type of the variable is
adequate. However, use of the default fill value for data type byte is not recom-
mended. Note that if you change the value of this attribute, the changed value
applies only to subsequent writes; previously written data are not changed.
Generic applications often need to write a value to represent undefined or miss-
ing values. The fill value provides an appropriate value for this purpose because
it is normally outside the valid range and therefore treated as missing when read
by generic applications. It is legal (but not recommended) for the fill value to
be within the valid range.
For more information for C programmers see Section “Fill Values” in The
NetCDF C Interface Guide. For more information for Fortran programmers
see Section “Fill Values” in The NetCDF Fortran 77 Interface Guide.
missing_value
This attribute is not treated in any special way by the library or conforming
generic applications, but is often useful documentation and may be used by
specific applications. The missing_value attribute can be a scalar or vector
containing values indicating missing data. These values should all be outside
the valid range so that generic applications will treat them as missing.
signedness
Deprecated attribute, originally designed to indicate whether byte values should
be treated as signed or unsigned. The attributes valid_min and valid_max may
be used for this purpose. For example, if you intend that a byte variable store
only non-negative values, you can use valid_min = 0 and valid_max = 255. This
attribute is ignored by the netCDF library.
C_format A character array providing the format that should be used by C applications
to print values for this variable. For example, if you know a variable is only
accurate to three significant digits, it would be appropriate to define the C_format
attribute as "%.3g". The ncdump utility program uses this attribute for
variables for which it is defined. The format applies to the scaled (internal) type
and value, regardless of the presence of the scaling attributes scale_factor and
add_offset.
FORTRAN_format
A character array providing the format that should be used by FORTRAN
applications to print values for this variable. For example, if you know a variable
is only accurate to three significant digits, it would be appropriate to define the
FORTRAN_format attribute as "(G10.3)".
title A global attribute that is a character array providing a succinct description of
what is in the dataset.
history A global attribute for an audit trail. This is a character array with a line
for each invocation of a program that has modified the dataset. Well-behaved
generic netCDF applications should append a line containing: date, time of
day, user name, program name and command arguments.
Conventions
If present, ’Conventions’ is a global attribute that is a character array for the
name of the conventions followed by the dataset, in the form of a string that
is interpreted as a directory name relative to a directory that is a repository of
documents describing sets of discipline-specific conventions. This permits a hi-
erarchical structure for conventions and provides a place where descriptions and
examples of the conventions may be maintained by the defining institutions and
groups. The conventions directory name is currently interpreted relative to the
directory pub/netcdf/Conventions/ on the host machine ftp.unidata.ucar.edu.
Alternatively, a full URL specification may be used to name a WWW site where
documents that describe the conventions are maintained.
For example, if a group named NUWG agrees upon a set of conventions for
dimension names, variable names, required attributes, and netCDF representa-
tions for certain discipline-specific data structures, they may store a document
describing the agreed-upon conventions in a dataset in the NUWG/ subdirec-
tory of the Conventions directory. Datasets that followed these conventions
would contain a global Conventions attribute with value "NUWG".
Later, if the group agrees upon some additional conventions for a specific
subset of NUWG data, for example time series data, the description of the
additional conventions might be stored in the NUWG/Time_series/ subdirectory,
and datasets that adhered to these additional conventions would use the global
Conventions attribute with value "NUWG/Time_series", implying that this
dataset adheres to the NUWG conventions and also to the additional NUWG
time-series conventions.
Appendix C: File Format Specification 65
this appendix include CDL (“Common Data Language”, the original ASCII form of binary
netCDF data), and NcML (NetCDF Markup Language, an XML-based representation for
netCDF metadata and data).
Knowledge of format details is not required to read or write netCDF datasets. Software
that reads netCDF data using the reference implementation automatically detects and uses
the correct version of the format for accessing data. Understanding details may be helpful
for understanding performance issues related to disk or server access.
The netCDF reference library, developed and supported by Unidata, is written in C,
with Fortran77, Fortran90, and C++ interfaces. A number of community and commercially
supported interfaces to other languages are also available, including IDL, Matlab, Perl,
Python, and Ruby. An independent implementation, also developed and supported by
Unidata, is written entirely in Java.
ever performance features of the netCDF-4 formats that do not require additional features
of the enhanced model, such as per-variable compression and chunking, efficient dynamic
schema changes, and larger variable size limits, offer potentially significant performance
improvements to readers of data stored in this format, without requiring program changes.
numrecs field in the header will not be updated as records are added to the file. [This
feature is not yet implemented].
Note on padding: In the special case of only a single record variable of character, byte,
or short type, no padding is used between data values.
Note on byte data: It is possible to interpret byte data as either signed (-128 to 127)
or unsigned (0 to 255). When reading byte data through an interface that converts it
into another numeric type, the default interpretation is signed. There are various attribute
conventions for specifying whether bytes represent signed or unsigned data, but no standard
convention has been established. The variable attribute “_Unsigned” is reserved for this
purpose in future implementations.
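The two interpretations of the same stored bytes can be shown with Python's struct module. This is only an illustration of the signed and unsigned ranges mentioned above; the helper names are hypothetical and nothing here reads a netCDF file.

```python
import struct


def as_signed(raw):
    """Interpret each byte as a signed value (-128 to 127)."""
    return list(struct.unpack("%db" % len(raw), raw))


def as_unsigned(raw):
    """Interpret each byte as an unsigned value (0 to 255)."""
    return list(struct.unpack("%dB" % len(raw), raw))


raw = bytes([0x00, 0x7F, 0x80, 0xFF])
print(as_signed(raw))    # prints [0, 127, -128, -1]
print(as_unsigned(raw))  # prints [0, 127, 128, 255]
```

Values up to 127 are identical under both readings; only bytes with the high bit set differ, by exactly 256.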
Note on char data: Although the characters used in netCDF names must be encoded
as UTF-8, character data may use other encodings. The variable attribute “_Encoding” is
reserved for this purpose in future implementations.
Note on fill values: Because data variables may be created before their values are written,
and because values need not be written sequentially in a netCDF file, default “fill values”
are defined for each type, for initializing data values before they are explicitly written. This
makes it possible to detect reading values that were never written. The variable attribute
“_FillValue”, if present, overrides the default fill value for a variable. If _FillValue is defined
then it should be scalar and of the same type as the variable.
Fill values are not required, however, because netCDF libraries have traditionally sup-
ported a “no fill” mode when writing, omitting the initialization of variable values with fill
values. This makes the creation of large files faster, but also eliminates the possibility of
detecting the inadvertent reading of values that haven’t been written.
Examples
By using the grammar above, we can derive the smallest valid netCDF file, having no
dimensions, no variables, no attributes, and hence, no data. A CDL representation of the
empty netCDF file is
netcdf empty { }
This empty netCDF file has 32 bytes. It begins with the four-byte “magic number”
that identifies it as a netCDF version 1 file: ‘C’, ‘D’, ‘F’, ‘\x01’. Following are seven 32-bit
integer zeros representing the number of records, an empty list of dimensions, an empty list
of global attributes, and an empty list of variables.
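The byte count can be checked by assembling the same 32 bytes directly. This is a sketch of the layout just described, not a general writer: the function name empty_classic_file is hypothetical, and big-endian byte order is used for the 32-bit integers (for all-zero values the order does not change the result).

```python
import struct


def empty_classic_file():
    """Return the bytes of an empty netCDF version 1 (classic) file."""
    magic = b"CDF\x01"
    # Seven 32-bit zeros: the number of records, then an empty dimension
    # list, an empty global attribute list, and an empty variable list.
    body = struct.pack(">7i", 0, 0, 0, 0, 0, 0, 0)
    return magic + body


data = empty_classic_file()
print(len(data))          # prints 32
print(data[:4])           # prints b'CDF\x01'
```

The first 16 of these bytes should correspond to the hexadecimal line 4344 4601 0000 ... shown in the dump.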
Below is an (edited) dump of the file produced using the Unix command
od -xcs empty.nc
Each 16-byte portion of the file is displayed with 4 lines. The first line displays the bytes
in hexadecimal. The second line displays the bytes as characters. The third line displays
each group of two bytes interpreted as a signed 16-bit integer. The fourth line (added by
hand) presents the interpretation of the bytes in terms of netCDF components and values.
4344 4601 0000 0000 0000 0000 0000 0000
C D F 001 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
17220 17921 00000 00000 00000 00000 00000 00000
[magic number ] [ 0 records ] [ 0 dimensions (ABSENT) ]
data:
vx = 3, 1, 4, 1, 5 ;
}
which corresponds to a 92-byte netCDF file. The following is an edited dump of this file:
4344 4601 0000 0000 0000 000a 0000 0001
C D F 001 \0 \0 \0 \0 \0 \0 \0 \n \0 \0 \0 001
17220 17921 00000 00000 00000 00010 00000 00001
[magic number ] [ 0 records ] [NC_DIMENSION ] [ 1 dimension ]