ICON Tutorial 2017
February/March 2017
Practical Exercises for NWP Mode and ICON-ART
Max-Planck-Institut für Meteorologie
Acknowledgments
This tutorial is based on the extensive COSMO Model Tutorial written by U. Schättler,
U. Blahak, M. Baldauf et al. (2016).
Chapter 9 was provided by R. Potthast and A. Fernandez del Rio, DWD Data Assimilation
division.
Section 5.2.4 has been contributed by A. Seifert, DWD Physical Processes division.
Chapter 8 was provided by the Institute of Meteorology and Climate Research at the
Karlsruhe Institute of Technology (KIT).
0. Preface
The main goals formulated in the initial phase of the collaboration were
• better conservation properties than in the existing global models, with the obligatory
requirement of exact local mass conservation and mass-consistent transport,
• the availability of some means of static mesh refinement. ICON is capable of mixing
one-way nested and two-way nested grids within one model application, combined
with an option for vertical nesting. This allows the global grid to extend into the
mesosphere (which greatly facilitates the assimilation of satellite data) whereas the
nested domains extend only into the lower stratosphere in order to save computing
time.
The ICON modelling framework became operational in DWD’s forecast system in Jan-
uary 2015. During the first six months only global simulations were executed with a hor-
izontal resolution of 13 km and 90 vertical levels. Starting from July 21st , 2015, model
simulations have been complemented by a nesting region over Europe.
The model source code has been made available for scientific use under an institutional
license since 2015; as of February 2017, an individual licensing procedure has not yet been
released.
Not all topics in this manuscript are covered during the workshop. Therefore, the
manuscript can be used as a textbook, similar to a user manual for the ICON model.
Readers are assumed to have a basic knowledge of the design and usage of numerical
weather prediction models.
Even though the chapters in this textbook are largely independent, they should preferably
not be treated in an arbitrary order.
• New users who are interested in the regional model should read Chapter 5 in addition.
To some extent this document can also be used as a reference manual. To this end, we
refer to the index of namelist parameters on page 151.
Each of the chapters concludes with a number of exercises revisiting the topics from that
part. Paragraphs describing common pitfalls and containing details for advanced users are
marked by a special symbol.
For data requests with respect to DWD operational data products please contact
[email protected]. Access to the grid generator web service (see Section 2.2.2)
requires a user account. To this end, please contact [email protected].
For model users who intend to process data products of DWD’s operational runs, the
DWD database documentation may be a valuable resource. It can be found (in English
language) on the DWD web site
www.dwd.de/SharedDocs/downloads/DE/modelldokumentationen/nwv/icon/icon_dbbeschr_aktuell.pdf.
The pre- and post-processing tools of the DWD ICON Tools collection are described in
more detail in the DWD ICON Tools manual Prill (2014).
The purpose of this tutorial is to give you some practical experience in installing and run-
ning the ICON model package. Exercises are carried out on the supercomputers at DWD
but the principal steps of the installation can directly be transferred to other systems.
The source code for the ICON model package consists of the following three components:
Figure 1.1 shows a brief description of the directory structure of the ICON model and of
the directories containing the test case data under the root tree.
The ICON model code is located in the directory icon-dev. The most important subdi-
rectories are described in the following:
Subdirectory build
Within the build directory, a subdirectory with the name of your computer archi-
tecture is created during compilation. Within this subdirectory, a bin subdirectory
containing the ICON binary icon and several other subdirectories containing the
compiled module files are created.
Figure 1.1.: Directory structure of the ICON model and of the directories containing the
test case data under the root tree (icon_tutorial), e.g. test_cases/case1 for
test case 1, an idealized experiment (see Ch. 3).
Subdirectory config
Inside the config directory, different machine-dependent configurations are stored
in configuration script files (see Section 1.2.1).
Subdirectory src
Within the src directory, the source code of ICON including the main program and
ICON modules can be found. The modules are organized in several subdirectories:
The main program icon.f90 can be found inside the subdirectory src/drivers. Ad-
ditionally, this directory contains the modules for a hydrostatic and a nonhydrostatic
setup.
The configuration of ICON run-time settings is performed within the modules inside
src/configure_model and src/namelists. Modules regarding the configuration of
idealized test cases can be found inside src/testcases.
The dynamics of ICON are inside src/atm_dyn_iconam and the physical parame-
terizations inside src/atm_phy_nwp. Surface parameterizations can be found inside
src/lnd_phy_nwp.
The ICON code comes with its own LAPACK and BLAS sources. For performance reasons,
these libraries may be replaced by machine-dependent optimizations. However, please note
that LAPACK and BLAS routines are not actively used by the nonhydrostatic model.
Especially for I/O tasks, the ICON model package requires external libraries. Two data
formats are implemented in the package to read and write data from or to disk: GRIB and
NetCDF.
• NetCDF (Network Common Data Form) is a set of software libraries and machine-
independent data formats that support the creation, access, and sharing of array-
oriented scientific data. NetCDF files contain the complete information about the
dependent variables, the history, and the fields themselves. The NetCDF file format
is also used for the definition of the computational mesh (grid topology).
For more information on NetCDF see http://www.unidata.ucar.edu.
To work with the formats described above, the following libraries are used by the
ICON model package. For this training course, the paths to these libraries on the
computer system in use are already specified in the Makefile.
The CDI (Climate Data Interface) library has been developed and implemented by the Max-Planck-Institute for Mete-
orology in Hamburg. It provides a C and Fortran interface to access climate and NWP
model data. Supported data formats include GRIB1/2 and NetCDF. A con-
densed copy of the CDI is distributed together with the ICON model package. Note that
the CDI is also used by the DWD ICON Tools.
For more information see https://code.zmaw.de/projects/cdi.
The European Centre for Medium-Range Weather Forecasts (ECMWF) has developed an
application programming interface (API) to pack and unpack GRIB1 as well as GRIB2
formatted data. For setting meta-data, the GRIB-API uses the so-called key/value ap-
proach. The ICON model uses this GRIB-API library indirectly through the CDI.
In addition to the interface routines, the GRIB-API provides command-line tools for
checking and manipulating GRIB data from the shell. Amongst them, the most
important ones are grib_ls and grib_dump for listing the contents of a GRIB file, and
grib_set for (re)setting specific key/value pairs.
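For example (file names hypothetical):

grib_ls forecast.grb2                      # list short name, level and date of each record
grib_dump forecast.grb2 | less             # dump all key/value pairs of the records
grib_set -s centre=78 in.grb2 out.grb2     # (re)set the originating centre and write a new file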
For more information on GRIB-API we refer to the ECMWF web page:
https://software.ecmwf.int/wiki/display/GRIB/Home
Installation: The source code for the GRIB-API can be downloaded from the
ECMWF web page.
Please refer to the README for installing the GRIB-API libraries, which is done with
a configure script. Check the following settings:
• The GRIB-API can make use of optional JPEG packing of the GRIB records,
but this requires the installation of additional libraries. Since the ICON model
does not apply this packing algorithm, the support for JPEG can be disabled
during the configure step with the option --disable-jpeg.
• To use statically linked libraries and binaries you should set the configure
option --enable-shared=no.
After the configuration has finished, the GRIB-API library can be built with make
and then make install.
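A typical build sequence could thus look as follows (installation prefix and version number
are hypothetical):

# unpack the downloaded sources
tar xzf grib_api-X.Y.Z.tar.gz
cd grib_api-X.Y.Z
# disable the JPEG packing (not needed by ICON) and build static libraries only
./configure --prefix=$HOME/local/grib_api --disable-jpeg --enable-shared=no
make            # build the library
make install    # install into the --prefix directory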
An installation of the GRIB-API always consists of two parts: First, there is the binary
compiled library itself with its functions for accessing GRIB files. But, second, there is
the definitions directory which contains plain-text descriptions of meta data. For example,
these definition files contain information about the variable short name and the corre-
sponding GRIB code triplet.
The short name, e.g., “t” for temperature, is not stored in data files, in contrast to the
corresponding GRIB triplet. The definition files therefore constitute an essential link: If
the definition files at two institutes do not match, it is possible that the same data file
shows the record “OMEGA” on one site (our DWD system), while the same GRIB record
bears the short name “w” on the other site (both have indicatorOfParameter=39).
In theory, this situation could be resolved by changing all field names in the ICON namelist
setup, where possible. However, it is likely that further related errors will follow in
the ICON model when it searches for a specific variable name. In this case you might
need to change the definition files after all.
The DWD definition files for the GRIB-API can be obtained via Github
https://github.com/erget/grib-api.definitions.edzw
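For instance, the definition files could be obtained and activated as follows (the clone
target and search path are hypothetical and must be adapted to where the definitions
actually reside inside the repository):

git clone https://github.com/erget/grib-api.definitions.edzw $HOME/edzw-definitions
# DWD definitions first, standard definitions as fallback
export GRIB_DEFINITION_PATH=$HOME/edzw-definitions:$GRIB_DEFINITION_PATH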
A special library, the NetCDF library, is necessary to write and read data using the
NetCDF format. This library also contains tools for manipulating and visualizing the
data (ncdump utility, see Section 7.1.1).
If the library is not yet installed on your system, you can get the source code and docu-
mentation from
http://www.unidata.ucar.edu/software/netcdf/index.html
This includes a description of how to install the library on different platforms. Please make
sure that the F90 package is also installed, since the model reads and writes grid data
through the F90 NetCDF functions. While the classic NetCDF format cannot deal with
files larger than 2 GiB, the newer NetCDF-4/HDF5 format permits storing files as large as
the underlying file system supports. However, NetCDF-4/HDF5 files cannot be read by
NetCDF library versions older than 4.0.
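If an existing classic-format file needs to be converted, the nccopy utility that ships with
NetCDF can be used; a minimal sketch with hypothetical file names:

nccopy -k netCDF-4 data_classic.nc data_nc4.nc   # convert to the NetCDF-4/HDF5 format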
This section explains the configuration process of the ICON model. It is assumed that the
libraries and programs discussed in Section 1.1.2 are present on your computer system.
For convenience, the compiler version and the GRIB-API version are documented in the
log output of each model run.
For a small number of HPC platforms, suitable settings are provided with the code; the
minimum compiler requirements are summarized in Table 1.1.
Table 1.1.: Minimum requirements for Fortran compilers for building the ICON code (as of
February 2017). The versions listed are possible compiler versions rather than
strict minimum requirements; older versions might work as well.
A configure file is provided that takes over the main work of generating the compilation
setup. This autoconf configuration is used to analyze the computer architecture (hardware
and software) and to set user-specified preferences, e.g. the compiler. These preferences are
read from config/mh-<OS>, where <OS> is the identified operating system.
To configure the source code, please log into the Cray XC 40 login node xce and change
into the subdirectory icon-dev. Then type:
./configure --with-fortran=compiler
where compiler is {gcc,nag,intel,pgi,cray}. The default is gcc. Here, for the DWD
platform, please choose the option --with-fortran=cray.
With the Unix command make the programs are compiled and all object files are linked
to create the binaries. On most machines you can also compile the routines in parallel
by using GNU make with the command gmake -j np, where np gives the number of
processors to use (np typically about 8).
During the compilation process, a subdirectory with the name of your computer archi-
tecture is created within the build directory. In this subdirectory, a bin subdirectory
containing the binary icon and several further subdirectories containing the compiled
module files are created.
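Putting these steps together, a complete build on the DWD Cray XC 40 could look like
the following sketch (the -j value is merely a typical choice):

cd icon-dev
./configure --with-fortran=cray     # select the Cray compiler settings
gmake -j 8                          # compile and link in parallel
ls build/*/bin/icon                 # location of the resulting model binary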
If you wish to re-configure ICON it is advisable first to clean the old setup by giving:
make distclean
Some more details on configure options can be found in the help of the configure command:
./configure --help
Note for advanced users: Hybrid (MPI + OpenMP) parallel binaries require the
explicit configure option “--with-openmp” on all platforms except the Cray XC 40.
If one uses, e.g., the Intel Fortran compiler, this option must be given explicitly
during the configure process.
The DWD ICON Tools provide a number of utilities for the pre- and post-processing of
ICON model runs. All of these tools can run in parallel on multi-core systems (OpenMP)
and some offer an MPI-parallel execution mode in addition. We give a short overview over
several tools in the following and refer to the documentation (Prill (2014)) for details.
The iconremap utility is especially important for pre-processing the initial data for the
basic test setups in this manuscript. iconremap (ICOsahedral Nonhydrostatic model
REMAPping) is a utility program for horizontally interpolating ICON data onto regular
grids and vice versa. In addition, it can interpolate between triangular grids of different
resolution.
The iconremap tool reads and writes data files in GRIB2 or NetCDF file format. For
triangular grids an additional grid file in NetCDF format must be provided.
Several interpolation algorithms are available: Nearest-neighbor remapping, radial basis
function (RBF) approximation of scalar fields, area-weighted formula for scalar fields, RBF
interpolation for wind fields from cell-centred zonal, meridional wind components u, v to
normal and tangential wind components at edge midpoints of ICON triangular grids (and
reverse), and barycentric interpolation.
Note that iconremap only performs a horizontal remapping, while the vertical
interpolation onto the model levels of ICON is handled independently at startup.
The iconsub tool (ICOsahedral Nonhydrostatic model SUBgrid extraction) allows “cut-
ting” sub-areas out of ICON data sets.
After reading a data set on an unstructured ICON grid in GRIB2 or NetCDF file format,
the tool offers the following functionality: It may “cut out” a subset, specified by two
corners and a rotation pole (similar to the COSMO model). Alternatively, a boundary
region of a local ICON grid, specified by parent-child relations, may be extracted. Finally,
the extracted data is stored in GRIB2 or NetCDF file format.
Multiple sub-areas can be extracted in a single run of iconsub.
The icongridgen tool is a simple grid generator. An existing global or local grid file is
taken as input and parts of this input grid (or the whole grid) are refined via bisection.
No storage of global grids is necessary and the tool also provides an HTML plot of the
grid.
To compile the DWD ICON Tools binaries, log into the Cray XC 40 login node xce and
change into the subdirectory icontools by typing
cd dwd_icon_tools/icontools/
You get a list of available compile targets by typing make. The following output is exem-
plary and may differ from your current version:
------------------------------------------------------------------------------------
DWD ICONTOOLS
For example, the binary for the Cray XC 40 can be created by typing
make cray_mpi
ICON Tools Libraries: The DWD ICON Tools are divided into several independent
libraries which can be linked against user applications. The purpose of this hierar-
chy of sub-libraries is to let users either access the high-level API (iconremap, iconsub,
icongpi, icongridgen, icondelaunay), which is part of the overarching libicontools.a
library, or alternatively call the low-level API (interpolation, load grid, query
point etc.) directly. In the latter case, it is not necessary to read namelists and
data via the ICON Tools, since all the necessary data is provided via subroutine
interfaces.
The DWD ICON utilities use the GRIB-API for reading data in GRIB2 format.
The GRIB-API is indirectly accessed by the Climate Data Interface (CDI).
1.4. Exercise
For practical work during these exercises, you need information about the use of the
computer systems and from where you can access the necessary files and data.
Please take a look at Appendix A to get information on how to use DWD’s supercomputer
and run test jobs.
EX 1.1 Log into the Cray XC 40 login node “xce” and install the ICON model in the $WORK
directory of your Cray XC 40 user account. For that you have to do the following:
• The necessary files for the ICON tutorial can be found in the subdirectory
/e/uwork/trng024/packages
• Change into your $WORK directory and extract the compressed tar file to yield
the directory structure depicted in Figure 1.1 containing the ICON sources
and test data.
• Follow the instructions in Section 1.2.2 to configure and compile the ICON
model on the Cray XC 40 platform.
• Change into the subdirectory dwd_icon_tools and build the DWD ICON
Tools according to Section 1.3.
Besides the source code of the ICON package and the libraries, several data files are
needed to perform runs of the ICON model. There are four categories of necessary data:
horizontal grid files, external parameters, data describing the initial state (DWD
analysis or ECMWF IFS data), and, for limited area runs, boundary conditions sampled
at regular time intervals.
In order to run ICON, it is necessary to load the horizontal grid information as an input
parameter. This information is stored within so-called grid files. For an ICON run, at least
one global grid file is required. For model runs with nested grids, additional files of the
nested domains are necessary. Optionally, a reduced radiation grid for the global domain
may be used.
The following nomenclature has been established: In general, by RnBk we denote a grid
that originates from an icosahedron whose edges have been initially divided into n parts,
followed by k subsequent edge bisections. See Figure 2.1 for an illustration of the grid
creation process. The total number of cells in a global ICON grid RnBk is given by
$n_\mathrm{cells} := 20\,n^2\,4^k$. The effective mesh size can be estimated as
$$\Delta x = \sqrt{S_\mathrm{earth}/n_\mathrm{cells}} \approx \frac{5050}{n\,2^k}\ \mathrm{km}, \qquad (2.1)$$
where $S_\mathrm{earth}$ denotes the earth's surface area. Note that by construction, each vertex of a global
grid is adjacent to exactly 6 triangular cells, with the exception of the original vertices of
the icosahedron, the pentagon points, which are adjacent to only 5 cells.
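For example, evaluating Eq. (2.1) for a grid with $n = 3$ and $k = 7$ gives
$$n_\mathrm{cells} = 20 \cdot 3^2 \cdot 4^7 = 2\,949\,120, \qquad \Delta x \approx \frac{5050}{3 \cdot 2^7}\ \mathrm{km} \approx 13\ \mathrm{km},$$
which corresponds to the 13 km global resolution quoted in the Preface.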
The unstructured triangular ICON grid resulting from the grid generation process is rep-
resented in NetCDF format. This file stores coordinates and topological index relations
between cells, edges and vertices.
The most important data entries of the main grid file are
Figure 2.1.: Illustration of the grid construction procedure. The original spherical icosahe-
dron is shown in red, denoted as R1B00 following the nomenclature described
in the text. In this example, the initial division (n=2; black dotted), followed
by one subsequent edge bisection (k=1), yields an R2B01 grid (solid lines).
Figure 2.2.: Illustration of the parent-child relationship in refined grids. Left: Triangle
subdivision and local cell indices. Right: The grids fulfil the ICON requirement
of a right-handed coordinate system $[\vec{e}_t, \vec{e}_n, \vec{e}_w]$.
Refinement Information
Additional topological information is required for ICON’s refined nests: Each “parent”
triangle is split into four “child” cells. In the grid file only child-to-parent relations are
stored while the parent-to-child relations are computed in the model setup. The local
numbering of the four child cells (see Fig. 2.2) is also computed in the model setup.
The refinement information may be provided in a separate file. This optional grid con-
nectivity file (suffix -grfinfo.nc) acts as a fallback at model startup if the expected
information is not found in the main grid file.
Finally, note that the data points on the triangular grid are the cell circumcenters. There-
fore the data points of the global grid are located close to the nest data sites, but they do
not coincide exactly.
Introductory Remarks
There are (at least) three grid generation tools available for the ICON model: The ICON
model itself is shipped with a standalone tool grid_command. The executable of this grid
generator is created automatically during the build process and is located in the same
subdirectory as the model binary. We refer to the documentation
icon-dev/doc/Namelist_overview.pdf for details. A different grid generation tool has
been developed at the Max-Planck-Institute for Meteorology by L. Linardakis. Finally,
another grid generator is contained in the DWD ICON Tools.
In this section we will discuss the grid generator that is contained in the DWD ICON
Tools, because this utility also acts as the backend for the publicly available web tool.
The latter is shortly described in Section 2.2.2. It is important to note, however, that
this grid generator is not capable of generating grids with unusual root subdivisions or
non-spherical geometries like torus grids.
The DWD ICON Tools utility icongridgen is mainly controlled via a Fortran namelist.
The command-line option used to provide the name of this namelist file, as well as the
other available settings, are summarized by typing
icongridgen --help
The Fortran namelist gridgen_nml contains the filename of the parent grid which is to be
refined; the grid specification is set for each child domain independently. For example
(COSMO-EU nest) the settings are
dom(1)%region_type = 3
dom(1)%lrotate = .true.
dom(1)%hwidth_lon = 20.75
dom(1)%hwidth_lat = 20.50
dom(1)%center_lon = 2.75
dom(1)%center_lat = 0.50
dom(1)%pole_lon = -170.00
dom(1)%pole_lat = 40.00
For a complete list of available namelist parameters we refer to the documentation (Prill
(2014)).
The icongridgen grid generator checks for overlap with concurrent refinement regions,
i.e. no cells are refined which are neighbors or neighbors-of-neighbors (more precisely:
vertex-neighbor cells) of parent cells of another grid nest on the same refinement level.
Grid cells which violate this distance rule are “cut out” from the refinement region. Thus,
there is at least one triangle between concurrent regions.
Minimum distance between child nest boundary and parent boundary: As a second
constraint, in the case that the parent grid itself is a bounded regional grid, no cells
can be refined that are part of the indexing region (of width bdy_indexing_depth)
in the vicinity of the parent grid's boundary.
When the grid generator icongridgen is targeted at a limited area setup (for ICON-LAM),
two important namelist settings must be considered:
• Identifying the grid boundary zone. In Section 2.3.4 we will describe how to drive
the ICON limited area model. Creating the appropriate boundary data makes the
identification of a sufficiently large boundary zone necessary.
This indexing is enabled through the following namelist setting in gridgen_nml:
bdy_indexing_depth = 14
This means that 14 cell rows starting from the nest boundary are marked and can
be identified in the ICON-LAM setup, which is described in Section 2.3.4.
Figure 2.3.: Screenshots of the ICON download server hosted by the ZMAW in Hamburg.
For fixed domain sizes and resolutions a list of grid files has been pre-built for the ICON
model together with the corresponding reduced radiation grids and the external parame-
ters.
The contents of the primary storage directory are regularly mirrored to a public web site
for download, see Figure 2.3 for a screenshot of the ICON grid file server. The download
server can be accessed via
http://icon-downloads.zmaw.de
The pre-defined grids are identified by a centre number, a subcentre number and a num-
berOfGridUsed, the latter being simply an integer number, increased by one with every
new grid that is registered in the download list. Also contained in the download list is
a tree-like illustration which provides information on parent-child relationships between
global and local grids, and global and radiation grids, respectively.
Note that the grid information of some of the older grids (no. 23 – 40) is split over two
files: The users need to download the main grid file itself and a grid connectivity file (suffix
-grfinfo.nc).
ICON data files do not (completely) contain the description of the underlying grid. This
is an important consequence of the fact that ICON uses unstructured, pre-generated com-
putational meshes. Therefore, given a particular data file, one question naturally arises:
Which grid file is related to my simulation data?
The answer to this question can be obtained with the help of two meta-data items which
are part of every ICON data and grid file:
• numberOfGridUsed
This is simply an integer number, as explained in the previous section. The
numberOfGridUsed helps to identify the grid file in the public download list. If the
numberOfGridUsed differs between two given data files, then these are not based on
the same grid file.
• uuidOfHGrid
This acronym stands for universally unique identifier; it is a binary
data tag with a length of 128 bits. The UUID can be viewed as a fingerprint of
the underlying grid. Even though it is usually displayed as a hexadecimal num-
ber string, the UUID identifier is not human-readable. Nevertheless, two different
UUIDs can be tested for equality or inequality.
The meta-data values for numberOfGridUsed and uuidOfHGrid offer a way to track the
underlying grid file through all transformations in the scientific workflow.
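For instance, both items can be inspected directly from the files (file names hypothetical;
in NetCDF grid files the UUID is stored as a global attribute):

grib_ls -p numberOfGridUsed,uuidOfHGrid forecast.grb2    # GRIB2 data file
ncdump -h icon_grid.nc | grep -i uuid                     # NetCDF grid file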
External parameters are used to describe the properties of the earth’s surface. These data
include the topography, the land-sea-mask, and several parameters which are needed to
specify the dominant land use of a grid box like the soil type or the plant cover fraction.
The ExtPar software (ExtPar – External Parameters for numerical weather prediction and
climate application) is able to generate external parameters for the different models GME,
COSMO, HRM, and ICON. Experienced users can run ExtPar on UNIX or Linux systems
to transform raw data from various sources into model- and domain-specific data files. For a more
detailed overview of ExtPar, the reader is referred to the User and Implementation Guide
of ExtPar. The ExtPar preprocessor is COSMO software and is not part of the ICON
training course release.
As for the grid files, some external parameter files for the ICON model are available for
download for fixed domain sizes and resolutions. For the NWP release these data are
provided in NetCDF and GRIB2 file formats.
The topography contained in the ExtPar data files is not identical to the topography
data which is eventually used by the model. This is because at start-up, after
reading the ExtPar data, the topography field is optionally filtered by a smoothing
operator. Therefore, for post-processing purposes it is necessary to specify and use
the topography height topography_c (GRIB2 short name HSURF) from the model
output (cf. Section 4.3 and Appendix C).
Additional information for surface tiles: ExtPar data is available with and without
additional information for surface tiles.
ExtPar data files suitable for the tile approach are indicated by the suffix tiles.
They are also applicable when running the model without tiles. ExtPar files without
the suffix tiles, however, must only be used when running the model without tiles
(lnd_nml/ntiles = 1).
The data files do not differ in the number of fields, but only in the way some
fields are defined near coastal regions. Without the tiles suffix, various surface
parameters (e.g. SOILTYP, NDVI_MAX) are only defined at so-called dominant land
points, i.e. at grid elements where the land fraction exceeds 50%. With the tiles
suffix, however, these parameters are additionally defined at cells where the land
fraction is below 50%. By this, we allow for mixed water-land points. The same
holds for the lake depth (depth_lk), which is required by the lake parameterization.
In files without the tiles suffix, it is only defined at dominant lake points.
In addition to the ExtPar products, input fields for radiation are loaded into the ICON
model. These constant fields are distributed together with the model code in the subdi-
rectory icon-dev/data.
A web service has been made available to help users with the generation of custom grid
files. After entering grid coordinates through an online form, this web service creates a
corresponding ICON grid file together with the necessary external parameter file.
Figure 2.4.: Web browser screenshot of the web-based ICON grid generator tool. Left: Web
form. Right: HTML visualization of the resulting grid based on Google Maps.
A user account is required for access: Please contact [email protected] to this end.
Then, visit the web page of the pamore data service
https://webservice.dwd.de/cgi-bin/spp1167/webservice.cgi
and, after logging into the web site, choose External Parameters for ICON and COSMO
→ External Parameters and Grid Files for ICON.
The web form is more or less self-explanatory. The settings reflect the namelist parameters
of the icongridgen grid generator tool that runs as the first stage of the web service. These
are explained in Section 2.1.2 of this tutorial. The second stage, the ExtPar tool, does not
require further settings (with the exception of the surface tiles setting, see below).
The tool is capable of generating multiple grid files at once. Please note that the web-
based grid generation submits a batch job to DWD’s HPC system and takes
some time for processing! Finally all results (and log files) are packed together into
a ∗.tar.gz archive and the user is informed via e-mail about its FTP download site.
Additionally, a web browser visualization of the grids based on Google Maps is provided,
see Fig. 2.4.
Surface tiles: The web-based generator can produce ExtPar data with and with-
out additional information for surface tiles (see the explanation on p. 19). The
recommended default has the tile data enabled. This setting can be changed by
disabling the corresponding checkbox in the HTML form.
Minimum version required: Grid files that have been generated by the icongridgen
tool contain only child-to-parent relations while the parent-to-child relations are
computed in the model setup. Therefore these grids must be used together with
ICON versions newer than ∼ September 2016.
Global numerical weather prediction (NWP) is an initial value problem. The ability to
make a skillful forecast heavily depends on the accuracy with which the present atmo-
spheric (and surface/soil) state is known. Running forecasts with a limited area model in
addition requires accurate boundary conditions sampled at regular time intervals.
Initial conditions are usually generated by a process called data assimilation. (For so-called
idealized test cases, in contrast, no initial conditions need to be read in; all necessary state
variables are preset by analytical values.) Data assimilation combines irregularly distributed
(in space and time) observations with a short-term forecast of a general circulation model
(e.g. ICON) to provide a 'best estimate' of
the current atmospheric state. Such analysis products are provided by several global NWP
centers. In the following we will present and discuss the data sets that can be used to drive
the ICON model.
The most straightforward way to initialize ICON is to make use of DWD’s analysis prod-
ucts, which are generated operationally every 3 hours. They are available in GRIB2 format
on the native ICON grid. The analysis products are generated by a hybrid system named
EnVar which combines variational and ensemble methods. See Chapter 9 for more infor-
mation on DWD’s data assimilation system.
When started in this DWD initialization mode, the model loads two files: a first guess and
an analysis file. The term first guess denotes a short-range forecast of the NWP model at
hand, whereas the term analysis denotes all those fields which have been updated by the
assimilation system. This distinction is not strictly necessary for the initialization process,
however it helps to clarify which fields have been touched and updated by the assimilation
system (by making use of observations) and which not. The DWD initialization mode
comes in two flavours, which differ in the way the model state is pulled towards the
analysed state.
Non-Incremental Update
The non-incremental update is conceptually the easiest approach: It starts the model from
the full analysis directly. However, this comes at the price of a somewhat increased noise
level at the beginning of the simulation, due to a missing filtering procedure.
The model's current state (i.e. the first guess) and the analysis must have the same validity
time. Table 2.1 provides an overview of the fields that ICON expects to be contained in
the first guess and analysis file at 00UTC. A “yes” in column 2 and 4 indicates that
the corresponding variable is expected to be present in the first guess and analysis file,
respectively. This table shows the optimum situation in the sense that all requested fields
have been found in the input files. This table is part of the ICON runtime log output.
As will be explained in Section 9.2, the atmospheric analysis is performed more frequently
than the surface analysis. Therefore, the analysis product provided at times dif-
ferent from 00UTC usually contains only a subset of the fields provided at 00UTC. Con-
sequently, Table 2.1 will differ for non-00UTC runs in the sense that the affected fields
will be read from the first guess file instead of the analysis file.
In the incremental analysis update (IAU) method (Bloom et al., 1996; Polavarapu et al.,
2004) the analysis increment is not added completely in one time step, but is spread over
the model integration and added to the model states $x_k^{(b)}$ over an interval $\Delta T$, which
by default is $\Delta T = 3\,\mathrm{h}$. This method of gradually pulling the model from its current
state (first guess) towards the analysed state acts as a low-pass filter on the analysis
increments in the frequency domain, such that small-scale unbalanced modes are effectively filtered.
In the following, let us assume that we want to start a model forecast at 00UTC. Techni-
cally, the application of this method has some potential pitfalls, which the user should be
aware of:
• The analysis file has to contain analysis increments (i.e. deviations from the first
guess) instead of full fields, with validity time 00UTC. The only exceptions are
FR_ICE and T_SO, which must be full fields (see Table 2.2).
• The model must be started from a first guess which is shifted back in time
by 1.5 h with respect to the analysis. Thus, in the given example, the validity time of
the first guess must be 22:30UTC of the previous day. This is because the "drib-
bling" of the analysis increments is performed over the symmetric 3 h time window
[00UTC − 1.5 h, 00UTC + 1.5 h]. See Section 4.2.3 for an illustration of this process.
Table 2.2 provides an overview of the fields that ICON expects to be contained in the first
guess and analysis file at 00UTC. ICON internal variable names are given in column 1. A
“yes” in columns 2 or 4 indicates that the corresponding variable is expected to be present
in the first guess and analysis file, respectively. This table shows the optimum situation in
the sense that all requested fields have been found in the input files. This table is part of
the ICON runtime log output.
As already explained in the previous section, the analysis product provided at times dif-
ferent from 00UTC will only contain a subset of the fields provided at 00UTC. Table 2.2
will differ for non-00UTC runs in the way that the fields fr_seaice, t_so, and w_so
will be read from the first guess and not from the analysis file.
variable | FG read attempt | FG found | ANA read attempt | ANA found | data from |
--------- | --------------- | -------- | ---------------- | --------- | --------- |
vn | yes | yes | no | | fg |
w | yes | yes | no | | fg |
rho | yes | yes | no | | fg |
theta_v | yes | yes | no | | fg |
qc | yes | yes | no | | fg |
qi | yes | yes | no | | fg |
qr | yes | yes | no | | fg |
qs | yes | yes | no | | fg |
tke | yes | yes | no | | fg |
gz0 | yes | yes | no | | fg |
t_g | yes | yes | no | | fg |
t_mnw_lk | yes | yes | no | | fg |
t_wml_lk | yes | yes | no | | fg |
h_ml_lk | yes | yes | no | | fg |
t_bot_lk | yes | yes | no | | fg |
c_t_lk | yes | yes | no | | fg |
t_b1_lk | yes | yes | no | | fg |
h_b1_lk | yes | yes | no | | fg |
qv_s | yes | yes | no | | fg |
w_i | yes | yes | no | | fg |
t_so | yes | yes | yes | yes | both |
w_so_ice | yes | yes | no | | fg |
w_snow | yes | yes | no | | fg |
rho_snow | yes | yes | no | | fg |
qv | yes | yes | yes | yes | ana |
u | no | | yes | yes | ana |
v | no | | yes | yes | ana |
temp | no | | yes | yes | ana |
pres | no | | yes | yes | ana |
t_ice | yes | yes | yes | yes | ana |
h_ice | yes | yes | yes | yes | ana |
fr_seaice | yes | yes | yes | yes | ana |
w_so | yes | yes | yes | yes | ana |
t_snow | yes | yes | yes | yes | ana |
h_snow | yes | yes | yes | yes | ana |
freshsnow | yes | yes | yes | yes | ana |
Table 2.1.: First guess (FG) and analysis (ANA) input fields as required when starting
from DWD analysis at 00UTC without IAU. “yes/no” in columns 2 and 4
indicates whether a field is expected or not, while “yes/no” in columns 3 and 5
shows whether a field was found and read, or not. Finally, column 6 indicates
whether the respective field was taken from the FG or ANA or both sources.
IAU and limited area: For limited area runs it is not possible to make use of the
IAU method.
variable | FG read attempt | FG found | ANA read attempt | ANA found | data used |
----------- | --------------- | -------- | ---------------- | --------- | ----------|
vn | yes | yes | no | | fg |
w | yes | yes | no | | fg |
rho | yes | yes | no | | fg |
theta_v | yes | yes | no | | fg |
qv | yes | yes | yes | yes (I) | both |
qc | yes | yes | no | | fg |
qi | yes | yes | no | | fg |
qr | yes | yes | no | | fg |
qs | yes | yes | no | | fg |
tke | yes | yes | no | | fg |
gz0 | yes | yes | no | | fg |
t_g | yes | yes | no | | fg |
t_ice | yes | yes | no | | fg |
h_ice | yes | yes | no | | fg |
t_mnw_lk | yes | yes | no | | fg |
t_wml_lk | yes | yes | no | | fg |
h_ml_lk | yes | yes | no | | fg |
t_bot_lk | yes | yes | no | | fg |
c_t_lk | yes | yes | no | | fg |
t_b1_lk | yes | yes | no | | fg |
h_b1_lk | yes | yes | no | | fg |
qv_s | yes | yes | no | | fg |
w_i | yes | yes | no | | fg |
t_so | yes | yes | yes | yes | both |
w_so | yes | yes | yes | yes (I) | both |
w_so_ice | yes | yes | no | | fg |
t_snow | yes | yes | no | | fg |
rho_snow | yes | yes | no | | fg |
h_snow | yes | yes | yes | yes (I) | both |
freshsnow | yes | yes | yes | yes (I) | both |
snowfrac_lc | yes | yes | no | | fg |
u | no | | yes | yes (I) | ana |
v | no | | yes | yes (I) | ana |
temp | no | | yes | yes (I) | ana |
pres | no | | yes | yes (I) | ana |
fr_seaice | yes | yes | yes | yes | ana |
Table 2.2.: First Guess (FG) and Analysis (ANA) input fields as required when starting
from DWD analysis at 00UTC with IAU. “yes/no” in columns 2 and 4 in-
dicates whether a field is expected, while “yes/no” in columns 3 and 5 shows
whether a field was found and read. The marker (I) in column 5 highlights anal-
ysis increments as opposed to full analysis fields. Finally, column 6 indicates
whether the respective field was taken from the FG or ANA or both sources.
The ICON code contains a script for the automatic request of native (DWD) analysis
data from DWD's meteorological data management system SKY. It is located in the
subdirectory

icon-dev/scripts/preprocessing/sky4icon

Its command-line help (cf. Fig. 2.5) reads:

SKY4ICON.PY
Retrieve ICON data from the DWD "Sky" database.

positional arguments:
  startdate             start date [YYYYMMDDhhmmss]
  enddate               end date [YYYYMMDDhhmmss]

optional arguments:
  -h, --help            show this help message and exit
  --increment INCREMENT
                        increment time span [h] (default: 24)
  --mode MODE           mode: 1=IAU, 2=No IAU (default: 1)
  --ensemble            read ensemble data (default: False)
  --emember EMEMBER     ensemble member (default: 1)
This script allows importing analysis data for both IAU and non-IAU runs on the native
ICON grid, including data for the ICON-EU nest, from January 1, 2015 until
today.
In order to retrieve, for example, 6-hourly initial data from July 1, 2015 00UTC until
July 3, 2015 00UTC for an IAU-based model initialization, the following command line is
used:
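A sketch of such a call, based on the options shown in Figure 2.5 (the default --mode 1
selects IAU data):

sky4icon.py --increment 6 20150701000000 20150703000000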
A full set of command-line options can be obtained via sky4icon.py -h, see Fig. 2.5.
Ensemble data: Note that the script supports ensemble data as well. The command-
line options --ensemble --emember EMEMBER allow you to pick one or more analy-
ses from the LETKF analysis ensemble with 40 km horizontal resolution (member
EMEMBER), as an alternative to the high-resolution (13 km) deterministic analysis
produced by the EnVar system.
A pre-requisite for running the script is a valid account for the database “roma” (contact
[email protected]). An alternative way to get the data is to contact DWD’s data
center (see Section 0.2).
Model runs can also be initialized by “external” analysis files produced by the Integrated
Forecast System (IFS) that has been developed and is maintained by the European Centre
for Medium-Range Weather Forecasts (ECMWF). Initializing the ICON model with IFS
analysis files requires an additional pre-processing step using the DWD ICON Tools. The
details of this procedure are given below.
The ICON code contains a script for the automatic request of IFS data from
the MARS database. A full list of mandatory IFS analysis fields is provided
in Table 2.3. The Meteorological Archival and Retrieval System (MARS, see
https://software.ecmwf.int/wiki/display/UDOC/MARS+user+documentation) is
the main repository of meteorological data at ECMWF.
The script for importing from MARS is part of the ICON source code repository.
It is located in the subdirectory
icon-dev/scripts/preprocessing/mars4icon_smi
Important note:
The mars4icon_smi script must be executed on the ECMWF computer system!
In order to retrieve, for example, T1279 grid data with 137 levels for July 1, 2013, the
following command line is used:
After the successful MARS download, the IFS data must be interpolated from a regular
grid onto the ICON grid. To this end, the iconremap utility from the DWD ICON Tools
will be used in batch mode. It is important to note that very large grids should always be
processed MPI-parallel in batch mode.
A typical namelist for processing IFS data has the following structure:
&remap_nml
in_grid_filename = "${IFS_FILENAME_GRB}"
in_filename = "${IFS_FILENAME_GRB}"
in_type = 1 ! regular grid
out_grid_filename = "${ICON_GRIDFILE}"
out_filename = "${IFS_FILENAME_NC}"
out_type = 2 ! ICON grid
out_filetype = 4 ! NetCDF format
/
The control file must contain a separate namelist input_field_nml for each field of the
list of mandatory input fields given in Table 2.3. The u and v wind components require
special treatment: they must be interpolated to edge-normal wind components vn (see
the namelist above).
Note that ICON requires the soil moisture index (SMI) and not the volumetric soil moisture
content (SWV) as input. The conversion of SWV to SMI is currently performed as part
of the MARS request (mars4icon_smi). However, this conversion is not reflected in the
variable short names: The fields containing SMI for each surface layer are still termed
SWVLx. The ICON model, however, expects them to be named SMIx. Therefore, the
proper output name SMIx must be specified explicitly in the namelist input_field_nml
of iconremap; a sketch is given below.
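The following input_field_nml block illustrates this renaming for the uppermost soil
layer. It is only a sketch: the parameter names inputname/outputname and the exact
field spellings are assumptions and should be checked against the iconremap documentation
(Prill, 2014):

&input_field_nml               ! soil moisture, uppermost layer
 inputname  = "SWVL1"          ! name in the IFS file (already contains SMI, see text)
 outputname = "SMI1"           ! output name expected by ICON
/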
The DWD ICON Tools contain example run scripts for iconremap for a small number of
computing environments. For the Cray XC 40 environment, see
dwd_icon_tools/example/runscripts/xce_ifs2icon.run
for a script that performs an interpolation of IFS data. Of course, the namelist parameters
which are specified in this file can also be ported to corresponding runs of iconremap on
other platforms. A detailed documentation of the iconremap namelist parameters can
be found under dwd_icon_tools/doc/icontools_doc.pdf, i.e. Prill (2014).
Sometimes it is desirable to run ICON at horizontal resolutions which differ from those
of the initial data. One important application is high-resolution limited area runs, which
start from operational ICON forecasts or analyses. In this case a horizontal remapping of
the initial ICON data is necessary.
For limited area runs, the set of variables which must be interpolated onto the target
resolution is identical to the first guess dataset in Table 2.1. To be more precise, all
variables for which the data source in Table 2.1 indicates “fg” or “both” are necessary
and must be remapped. Either analysis or forecast datasets can be used. Due to the
necessary remapping step, however, it is not possible to use tiled surface data (the surface
tile approach is used operationally in ICON). Instead, aggregated surface fields must be
remapped. Remapping of the tiled datasets makes no sense, since the tile-characteristics
can differ significantly between a source and target grid cell.
The DWD ICON Tools contain a run script xce_remap_inidata which performs the remap-
ping. After adjusting the necessary file and directory names in
dwd_icon_tools/icontools/xce_remap_inidata
the script can be submitted to the PBS batch system of the Cray XC 40.
For each of the variables to be remapped, the script contains a namelist input_field_nml
which specifies details of the interpolation methods and the output name.
Important note:
• When remapping land surface fields, it is advisable to make use of the land-
sea mask information (var_in_mask="FR_LAND" in input_field_nml). By do-
ing so, we mask out water points so that only land points contribute to the
interpolation stencil.
• For simplicity, the script xce_remap_inidata interpolates the soil water con-
tent W_SO directly. A more accurate and advisable approach would be to con-
vert W_SO into the soil moisture index SMI beforehand and transform it back to
W_SO afterwards.
During limited area simulations with ICON, boundary conditions are updated periodi-
cally by reading input files. To this end, forecast or analysis data sets from a driving model
need to be available. These data sets may originate from ICON, IFS, or COSMO-DE
(data sets from other global or regional models may work as well, but have not been tested yet).
Between two lateral boundary data samples the boundary data is linearly interpolated in time.
In this section we briefly describe the process of generating boundary data for these runs.
The boundary data files must contain a prescribed set of variables.
The basic preprocessing steps for ICON-LAM are visualized in Figure 2.6.
The DWD ICON Tools contain a run script xce_limarea which processes a whole directory
of data files ("DATADIR"), mapping the fields onto the boundary zone of a limited area
grid.
The output files are written to the directory specified in the variable OUTDIR. The input
files are read from DATADIR, therefore this directory should not contain other files and
should not be identical to the output folder.
After adjusting the necessary filenames INGRID and LOCALGRID and the input directory
name DATADIR in dwd_icon_tools/icontools/xce_limarea, this script performs the
steps described in the following two sections. The xce_limarea script can be submitted
to the PBS batch system of the Cray XC 40.
Figure 2.6.: Preprocessing steps for the limited area model ICON-LAM: the driving model
provides the initial and boundary data, and ExtPar provides the external pa-
rameters. The grid generation process is described in Sections 2.1 – 2.2. The
initial data processing is covered by Section 2.3.3. Finally, the script xce_limarea
for extracting the boundary data is described in Section 2.3.4.
Important note: The 3D field HHL (geometric height of model half levels above
mean sea level) is constant in time. However, for technical reasons, HHL is required
in every boundary data file.
In the first step the above ICON-LAM preprocessing script creates an auxiliary grid file
which contains only the cells of the boundary zone. This step needs to be performed only
once before generating the boundary data.
We use the iconsub program from the collection of ICON Tools, see Section 1.3, with the
following namelist:
&iconsub_nml
grid_filename = "grid_file.nc",
output_type = 4,
lwrite_grid = .TRUE.,
/
&subarea_nml
ORDER = "grid_file_lbc.nc",
grf_info_file = "grid_file.nc",
min_refin_c_ctrl = 1
max_refin_c_ctrl = 14
/
Then running the iconsub tool creates a grid file grid_file_lbc.nc for the boundary
strip. The cells in this boundary zone are identified by their value in a special meta-data
field, the refin_c_ctrl index, e.g. refin_c_ctrl = 1,...,14, see Figure 2.7.
Any of the data sources explained in the Sections 2.3.1 and 2.3.2 can be chosen for the
extraction of boundary data. To be more precise, it is possible to extract boundary data
from ICON, IFS, and COSMO-DE forecasts.
To this end, we define the following namelist for the iconremap program from the collection
of ICON Tools. This happens automatically in our ICON-LAM preprocessing script:
&remap_nml
in_grid_filename = "input_grid_file"
in_filename = "input_data_file"
in_type = 2
out_grid_filename = "grid_file_lbc.nc"
out_filename = "data_file_lbc.nc"
out_type = 2
out_filetype = 4
/
Figure 2.7.: Illustration of the ICON-LAM boundary zone. The cells in this boundary zone
are identified by their refin_c_ctrl index, e.g. refin_c_ctrl = 1,...,14.
Here, the parameters in_type=2 and out_type=2 specify that both grids
(in_grid_filename and out_grid_filename) correspond to triangular ICON meshes.
Additionally, a namelist input_field_nml is appended for each of the preprocessed variables.
With respect to the output filename data_file_lbc.nc it is a good idea to follow a
consistent naming convention. See Section 5.1.4 on the corresponding namelist setup of
the ICON model.
Note that the input data file must contain only a single time step when running the
iconremap tool. The iconremap tool therefore must be executed repeatedly in order to
process the whole list of boundary data samples (this is done automatically within the
xce_limarea script).
In this context the following technical detail may considerably speed up the preprocess-
ing: The iconremap tool allows storing and loading interpolation weights to and from a
NetCDF file. When the namelist parameter ncstorage_file (a character string) is set in
the iconremap namelist remap_nml, the remapping weights are loaded from a file with this
name. If this file does not exist, the weights are created from scratch and then stored for
later use. Note that for MPI-parallel runs of the iconremap tool multiple weight files are
created; re-runs then require exactly the same number of processes.
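A sketch of the corresponding setting, extending the remap_nml shown above (the file name
is hypothetical):

&remap_nml
 ! ... settings as above ...
 ncstorage_file = "remap_weights_lbc.nc"   ! interpolation weights are stored here and re-used
/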
2.4. Exercises
In this exercise, you will deal with the necessary preparatory steps for performing global
real-case ICON runs.
Download of global data: Retrieve the necessary grids and external parameter files
from the ICON download server (see Section 2.1.3).
• Pick the R2B06 grid no. 24 from the list and right-click on the hyperlink for
the grid file, then choose ”copy link location”.
• Open a terminal window, log into the Linux cluster lce, and change into the
input subdirectory of case2. Download the grid file into this subdirectory by
typing wget <link location> -e http_proxy=ofsquid.dwd.de:8080.
• Repeat the last steps for downloading the grid connectivity information (see
Section 2.1.1 for explanation) and for the R2B07 N02 grid no. 28 (EU-nest).
• Pick the NetCDF version of the most recent ExtPar (external parameter) file
in the browser list (creation date 2015-08-05) and perform the previous steps
in order to download this file as well. Watch out that the ExtPar file matches
with the grid, i. e. make sure that both filenames contain the same
number (24). The ExtPar data file must have the tile suffix. Repeat this step
for the ExtPar file that matches with the grid no. 28.
• Download the R2B05 grid no. 23 together with its grid connectivity data. This
is the reduced (coarser) grid for radiation.
• Take a look at the grid data and the external parameters using the ncdump
utility, see Section 7.1.1 for details. Find out whether the external parameter
fields are defined at the grid vertices, edge midpoints or cells.
• Export the environment variable GRIB_DEFINITION_PATH with the setting
GRIB_DEFINITION_PATH=$GRIB_SAMPLES_PATH/../definitions
and run grib_ls once more. Are there any differences w.r.t. the displayed short
names? See Section 1.1.2 for an explanation.
EX 2.3 Prepare the initial data for ICON to start from an IFS analysis.
• Start the mars4icon script and download an IFS analysis file for the date
2017-01-12T00:00:00, see Section 2.3.2.
• Interpolate the data horizontally from the lat/lon grid onto the triangular
ICON grid using iconremap. The run script case4/xce_ifs2icon.run will do
this job. Fill in the name of the IFS file by setting in_grid_filename and
in_filename. Further help on iconremap can be found in Section 2.3.2.
• List the fields contained in the output file using cdo and/or ncdump. Compare
this to Table 2.3. Besides, find out which field(s) are not defined on cell
circumcenters.
Note:
The following exercise requires a number of data files which are produced as model
output in the real data exercise, Ex. 4.5.
The subdirectory case3 contains a script xce_remap_inidata which will do the
remapping.
• Adapt the path to the input data file (directory "lam_forcing" generated by
the real case run).
Inspect the local (TARGET) grid file and the grid which was used in Ex. 4.5 for
creating the initial data (SOURCE).
• If the source grid has a horizontal resolution of ≈ 20 km, what is the resolution
of your local (TARGET) grid in km, given the RxBy expressions identified above
(see Eq. 2.1)?
– Answer: TARGET grid res. ______ km
• Check the result: The remapping script should have created a file with prefix
init in case3/output.
• The subdirectory case3 contains a copy of the DWD ICON Tools script
xce_limarea, see Section 2.3.4. Open this script and
– adapt the path to the input data files (directory "lam_forcing" generated by
the real case run).
• Check the result: Visualize the boundary data with the NCL script
case3/plot_boundary_data.ncl.
• Investigate the files: How many cells are contained in the boundary grid?
– Answer: cells
• When looking at the set of boundary data variables, do you have any idea for
further improvement in terms of storage space?
Table 2.3.: Mandatory IFS analysis fields on a regular lat-lon grid, as retrieved by the
script mars4icon_smi.
As opposed to real-case runs, idealized test cases typically do not require any external
parameter or analysis fields for initialization. Instead, all initial conditions are computed
within the ICON model itself, based on analytical functions. These are either evaluated
pointwise at cell centers, edges, or vertices, or are integrated over the triangular control
volume.
The ability to run idealized model setups serves as a simple possibility to test the cor-
rectness of particular aspects of the model, either by comparison with analytic reference
solutions (if they exist), or by comparison with results from other models. Beyond that,
idealized test cases may help the scientist to focus on specific atmospheric processes.
In general, the ICON model is controlled by a so-called parameter file which uses Fortran
NAMELIST syntax. Default values are set for all parameters, so that you only have to
specify values that differ from the default.
Assuming that ICON has been compiled successfully, the next step is to adapt these
ICON namelists. Discussing all available namelist switches is beyond the scope of this
tutorial; we will merely focus on the particular subset of namelist switches that is
necessary to set up an idealized model run. A complete list of namelist switches can be
found in the namelist documentation
icon-dev/doc/Namelist_overview.pdf
ICON provides a set of pre-implemented test cases of varying complexity and focus, ranging
from pure dynamical core and transport test cases to “moist” cases, including microphysics
and potentially other parameterizations. A complete list of available test cases can also be
found in the namelist documentation, mentioned above.
Individual test cases can be selected and configured by namelist parameters of the namelist
nh_testcase_nml. To run one of the implemented test cases, only a horizontal grid file has
to be provided as input. A vertical grid file containing the height distribution of vertical
model levels is usually not required, since the vertical grid is constructed within the ICON
model itself, based on a set of namelist parameters described below.
such that coordinate surfaces become constant height surfaces above z = z_flat. Sometimes,
Equation (3.1) is also written in the discretized form

    z_h(x, y, k) = A(k) + B(k) · h(x, y) ,                                  (3.2)

where k denotes the vertical level index and z_h is the half level height.
In that case the user has to provide the vertical coordinate table (vct) as an input file.
The table consists of the A and B values (see Equation (3.2)) from which the half level
heights z_h(x, y, k) can be deduced. The A(k) [m] are fixed height values, with A(1) defining the
model top height H and A(nlev + 1) = 0 m. The dimensionless values B(k) control the
vertical decay of the topography signal, with B(1) = 0 and B(nlev + 1) = 1. Thus, at
each horizontal grid point z_h(x, y, 1) is the model top height, while z_h(x, y, nlev + 1) is the
surface height.
The structure of the expected input file is depicted in Table 3.1. Example files can be
found in icon-dev/vertical_coord_tables. The file must obey the following naming
rule: atm_hyb_sz_[nlev], where [nlev] must be replaced by the total number of full levels.
ICON expects the file to be located in the base directory from which the model is started.
Note that the filename specification must not be confused with another parameter which
has a similar name, grid_nml/vertical_grid_filename!
# File structure
# --------------
# A and B values are stored in arrays vct_a(k) and vct_b(k).
# The files in text format are structured as follows:
#
# -------------------------------------
# | k vct_a(k) [m] vct_b(k) [] | <- first line of file = header line
# | 1 A(1) B(1) | <- first line of A and B values
# | 2 A(2) B(2) |
# | 3 A(3) B(3) |
# | . |
# | . |
# | nlev+1 A(nlev+1) B(nlev+1)| <- last line of A and B values
# |=====================================| <- lines from here on are ignored
# |Source: | by mo_hyb_params:read_hyb_params
# |<some lines of text> |
# |Comments: |
# |<some lines of text> |
# |References: |
# |<some lines of text> |
# -------------------------------------
Table 3.1.: Structure of vertical coordinate table as expected by the ICON model.
In the case of a terrain-following hybrid Gal-Chen coordinate, the influence of terrain on the
coordinate surfaces decays only linearly with height. The basic idea of the Smooth LEvel
VErtical (SLEVE) coordinate (Schär et al., 2002; Leuenberger et al., 2010) is to increase
the decay rate by allowing smaller-scale terrain features to be removed more rapidly with
height. To this end, the topography h(x, y) is divided into two components
where h1 (x, y) denotes a smoothed representation of h(x, y), and h2 (x, y) = h(x, y) −
h1 (x, y) contains the smaller-scale contributions. The coordinate is then defined as
Different decay functions B1 and B2 are now chosen for the decay of the large- and small-
scale terrain features, respectively. These functions are selected such that the influence
of small-scale terrain features on the coordinate surfaces decays much faster with height
than their large-scale (well-resolved) counterparts.
The vertical grid is constructed during the initialization phase of ICON, based on ad-
ditional parameters defined in sleve_nml. Here we will only discuss the most relevant
parameters. For a full list, the reader is referred to the namelist documentation.
Note for advanced users: By default, a vertical stretching is applied such that co-
ordinate surfaces become non-equally distributed along the vertical, starting with a
minimum thickness of min_lay_thckn between the lowermost and second lowermost
half-level. If constant layer thicknesses are desired, min_lay_thckn must be set to a
value ≤ 0. The layer thickness is then determined as top_height/num_lev.
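As an illustration, the corresponding namelist fragment might look as follows. The values are purely illustrative, and flat_height (assumed here to be the namelist counterpart of z_flat) should be checked against the namelist documentation:

&sleve_nml
 top_height    = 75000.   ! model top height H in m (illustrative value)
 flat_height   = 16000.   ! height z_flat above which coordinate surfaces become flat
 min_lay_thckn = 20.      ! thickness of the lowermost layer in m; a value <= 0 gives constant thicknesses
/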
From the set of available idealized test cases we choose the Jablonowski-Williamson baro-
clinic wave test and walk through the procedure of configuring and running this test in
ICON.
The Jablonowski-Williamson baroclinic wave test (Jablonowski and Williamson, 2006) has
become one of the standard test cases for assessing the quality of dynamical cores. The
model is initialized with a balanced initial flow field. It comprises a zonally symmetric base
state with a jet in the midlatitudes of each hemisphere and a quasi-realistic temperature
distribution. Overall, the conditions resemble the climatic state of a winter hemisphere.
This initial state is in hydrostatic and geostrophic balance, but is highly unstable with
respect to small perturbations.
Figure 3.1.: Surface Pressure and 850 hPa Temperature at day 9 for the Jablonowski-
Williamson test case on a global R2B5 grid.
To trigger the evolution of a baroclinic wave in the northern hemisphere, the initial
conditions are overlaid with a weak (and unbalanced) zonal wind perturbation. The
perturbation is centered at (20° E, 40° N). In general, the baroclinic wave starts growing
observably around day 4 and evolves rapidly thereafter, with explosive cyclogenesis around
model day 8. After day 9, the wave train breaks (see Figure 3.1). If the integration is
continued, additional instabilities become more and more apparent, especially near the
pentagon points (see Section 2.1.1); these are an indication of spurious baroclinic
instabilities triggered by numerical discretization errors. In general, this test has the
capability to assess
• the presence of phase speed errors in the advection of poorly resolved waves,
Figure 3.2.: Initial tracer distributions which are available for the Jablonowski-Williamson
test case. Tracer q3 only depends on the latitudinal position, and tracer q4 is
constant.
This section explains several namelist groups and main switches that are necessary for
setting up an idealized model run. Settings for the Jablonowski-Williamson test case are
given in red.
is switched off completely (pure dynamical core test case). This implies that all
physical parameterizations (see nwp_phy_nml) are switched off automatically. If set
to 3, dynamics are forced by NWP-specific parameterizations. Individual physical
processes can be controlled via nwp_phy_nml, see also Table 5.1. In general, the setting
of iforcing depends on the selected test case.
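A sketch of the corresponding main switches for a pure dynamical-core run of the Jablonowski-Williamson test might look as follows. The test-case name string and the namelist group of itopo are assumptions here and should be checked against the namelist documentation; iforcing = 0 is assumed to denote "no forcing", as suggested by the description above:

&run_nml
 ltestcase = .TRUE.     ! idealized test case mode
 ldynamics = .TRUE.     ! solve the dynamical core
 iforcing  = 0          ! no physics forcing (pure dynamical core)
/
&nh_testcase_nml
 nh_test_name = 'jabw'  ! assumed name of the Jablonowski-Williamson case
/
&extpar_nml
 itopo = 0              ! analytic topography, no external parameter file
/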
While the above switches are necessary to specify the type of simulation (idealized vs.
real-case), nothing has been specified so far regarding the computational domain. ICON
has the capability for running
• global simulations with refined nests (so-called patches or domains), and
Here, the use of nests requires an additional remark. The refined nests are tightly embedded
into the global simulation in the sense that the prognostic variables are synchronized in
every (dynamical-core) time step of the parent domain.
In a one-way nested simulation the prognostic fields (or, in the default setup: the re-
spective tendencies) are prolongated within the nest boundary region from the coarser
parent grid to the finer nest grid. They are incorporated into the next iteration of the
nest’s dynamical core, but the fine-scale variables do not influence the global state.
On the other hand, the result state of the nested grid may also be transferred back
to the coarser parent grid in a feedback loop. In ICON this is called a two-way nested
simulation.
Note for advanced users: Even if the nested domain uses a different number of vertical
levels, its vertical layers must coincide with those of the parent domain. Therefore,
the nested domain may only have a lower top level height!
In the following we will explain how the Jablonowski-Williamson test case can be set up
for a global domain only and, in a second step, for a global domain with nests.
Figure 3.3.: Location of available nests for the baroclinic wave test case. The perturbation
triggering the baroclinic wave is centered at (20◦ E, 40◦ N) (red circle).
Examples
Example 3: Settings for a global Jablonowski-Williamson test run including two nests
on the same nesting level (i.e. the combination of both nests shown in Figure 3.3).
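Since the original listing is not reproduced here, the following fragment merely sketches the structure of such settings; the grid file names and level numbers are hypothetical and must match the grids actually used:

&grid_nml
 dynamics_grid_filename  = 'iconR2B05_DOM01.nc', 'iconR2B06_DOM02.nc', 'iconR2B06_DOM03.nc'
 dynamics_parent_grid_id = 0, 1, 1      ! both nests have the global grid (domain 1) as parent
/
&run_nml
 num_lev = 90, 60, 60                   ! fewer levels (lower top) in the nested domains
/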
Some additional information on the grid file naming convention can be found in Sec-
tion 2.1.1.
The integration time step and simulation length are defined as follows:
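The original listing is not reproduced here. Schematically, the relevant parameters are dtime and nsteps in run_nml together with the start date in time_nml; all values below are purely illustrative:

&run_nml
 dtime  = 360.                           ! basic (advective) time step in s
 nsteps = 2160                           ! number of time steps: 2160 x 360 s = 9 days
/
&time_nml
 ini_datetime_string = "2017-01-12T00:00:00Z"   ! illustrative start date
/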
Output is controlled by the namelist group output nml. It is possible to define more than
one output namelist and each output namelist has its own output file attached to it. For
example, the run script that is used in Exercise 3.1 contains three output namelists. The
details of the model output specification are discussed in Section 4.3.
3.4. Exercises
In this exercise you will learn how to set up idealized runs in ICON with and without
nested domains.
Job submission to the Cray XC 40 can be performed on the Linux cluster lce. Note,
however, that visualization tools (CDO, NCL, ncview) are only available on
the lce!
Preparations:
Retrieve the necessary grid file from the ICON download server (see Section 2.1.3):
EX 3.1 • Open the download page for the pre-defined ICON grids
http://icon-downloads.zmaw.de in your web browser.
• Pick the R2B05 grid no. 14 from the list and right-click on the hyperlink for
the grid file, then choose "copy link location".
• Open a terminal window, log in to the Linux cluster lce, and change into the
subdirectory test_cases/case1/input. Download the grid file into this
subdirectory by typing wget <link location> -e
http_proxy=ofsquid.dwd.de:8080.
• Change into the run script directory test_cases/case1. The run script is
named run_ICON_R02B05_JW.
• Fill in the name of the ICON model binary (including the path) and several
missing namelist parameters, i.e. set ltestcase, ldynamics, iforcing,
nh_test_name and itopo. See Section 1.1.1 for the location of your ICON
model binary and Section 3.3.1 for additional help on the namelist parameters.
• Submit the job to the Cray XC 40, using the PBS command qsub.
• Go to the output directory of case1. You will find three NetCDF output files
with (a) model level output on the native (triangular) grid, (b) pressure level
output on the native grid, (c) model level output interpolated onto a regular
lat-lon grid.
• Have a closer look at the different output files and their internal structure to
find out which one is which. We suggest using the tools cdo and ncdump as
described in Section 7.1.
What output interval (in h) has been used? – Answer: h
• Visualize the output with one of the tools described in Section 7. An NCL
script named JW_plot.ncl is available in test_cases/case1.
• Compare the output of the NCL script with the reference JABW_DOM01.ps
given in the subdirectory reference.
• Go to the run script directory. The run script for a nested grid experiment is
EX 3.2
termed run_ICON_R02B05N6_JW.
• Fill in the name of your ICON model binary and missing namelist parameters,
i.e. extend atmo_dyn_grids, num_lev (run_nml) and dynamics_parent_grid_id
(grid_nml), see Section 3.3.3 for additional help. The same output fields as for
the global domain will be generated for the regional domain(s).
The location of the prepared nests is depicted in Figure 3.3. Feel free to add
one or both nests to your global domain.
• After the job has finished, visualize the output on the global domain as well as
on nest(s) using one of the tools described in Section 7. When using NCL
(JW plot.ncl), you will have to adapt workdir and domain nr.
• Also compare the results on the global domain with those of the previous
exercise. Does the global domain output differ? If so, do you have an
explanation for this?
The JW test case provides a set of four pre-defined idealized tracer fields (see
Figure 3.2). Here, we will learn how to enable and control the transport of passive
tracers in idealized tests.
EX 3.3
• Enable tracer advection:
– Select one or more tracers from the set of pre-defined tracer distributions
(see Figure 3.2). A specific distribution can be selected by adding the
respective tracer number (1, 2, 3, or 4) to tracer_inidist_list (namelist
nh_testcase_nml, comma-separated list of integer values).
– Add the selected tracer fields to the list of output fields in the namelists
output_nml (namelist parameters ml_varlist and pl_varlist).
For idealized test cases, tracer fields are named qx, where x is equal to the
suffix specified via the namelist variable tracer_names (namelist
transport_nml, comma-separated list of string parameters). The nth entry
in tracer_names specifies the suffix for the nth entry in
tracer_inidist_list.
If nothing is specified, tracer fields are named qx, where x is a number
indicating the position of the tracer within the ICON-internal 4D tracer
container.
• Visualization: Enable lplot_transport in your NCL script. You can also have
a quick look using ncview.
In this lesson you will learn about the ICON time-stepping and how to initialize and run
the ICON model in a realistic NWP setup. Data provided by DWD’s Data Assimilation
Coding Environment (DACE) will serve as initial conditions. Before we conclude this
chapter with some exercises, the different mechanisms for parallel execution of the ICON
model will be discussed.
For efficiency reasons, different integration time steps are applied depending on the process
under consideration. In the following, the term dynamical core refers to the numerical
solution of the dry Navier-Stokes equations, while the term physics refers to the diabatic,
mostly sub-grid scale, processes that have to be parameterized. In ICON, the following
time steps have to be distinguished:
∆t : the basic time step, specified via the namelist variable dtime, which is used
for tracer transport, numerical diffusion and the fast-physics parameter-
izations.
∆τ : the short time step used within the dynamical core; the ratio be-
tween ∆t and ∆τ is specified via the namelist variable ndyn_substeps
(namelist nonhydrostatic_nml, number of dynamics substeps), which
has a default value of 5.
∆t_i,slow-physics : the process-dependent slow-physics time steps; they should be integer
multiples of ∆t and are rounded up automatically if they are not.
An illustration of the relationship between the time steps can be found in Figure 4.1. More
details on the physics-dynamics coupling will be presented in Section 5.2.2.
ICON solves the fully compressible nonhydrostatic Navier-Stokes equations using a time
stepping scheme that is explicit except for the terms describing vertical sound wave propa-
gation. Thus, the maximum allowable time step ∆τ for solving the momentum, continuity
and thermodynamic equations is determined by the fastest wave in the system - the sound
waves. As a rule of thumb, the maximum dynamics time step can be computed as

    ∆τ = 1.8 · 10^−3 s/m · ∆x ,                                        (4.1)

where ∆x is the effective horizontal mesh size in meters (see Equation (2.1)). This implies
that the namelist variable dtime should have a value of

    ∆t = 9 · 10^−3 s/m · ∆x .
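As a simple illustration (the numbers are not taken from any particular configuration): for an effective mesh size of ∆x ≈ 13 km, Equation (4.1) yields ∆τ ≈ 1.8 · 10^−3 s/m · 13000 m ≈ 23 s, and the rule above gives ∆t ≈ 9 · 10^−3 s/m · 13000 m ≈ 117 s, i.e. roughly two minutes, consistent with the default substepping ratio of 5.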
Figure 4.1.: ICON internal time stepping. Sub-cycling of dynamics with respect to trans-
port, fast-physics, and slow-physics. ∆t denotes the time step for transport
and fast physics and ∆τ denotes the short time step of the dynamical core.
The time step for slow-physics can be chosen individually for each process.
Details of the physics-dynamics coupling will be discussed in Chapter 5.2.
Historical remark: Note that historically, ∆τ rather than ∆t was used as basic
control variable specified in the namelist, as appears logical from the fact that a
universal rule for the length of the time step exists for ∆τ only. This was changed
shortly before the operational introduction of ICON because it turned out that
an adaptive reduction of ∆τ is needed in rare cases with very large orographic
gravity waves in order to avoid numerical instabilities. To avoid interferences with
the output time control, the long time step ∆t was then taken to be the basic
control variable, which always remains unchanged during a model integration. The
adaptive reduction of ∆τ is now accomplished by increasing the time step ratio
ndyn_substeps automatically up to a value of 8 if the Courant number for vertical
advection grows too large.
Time step for nested domains: In case of nested setups, the time step ∆t needs
to be specified for the global domain only. The adaptation for nested regions is done
automatically by multiplying ∆t with a factor of 0.5 for each nesting level. This
factor is hard-coded.
Additional time step restrictions for ∆t arise from the numerical stability of the horizontal
transport scheme and the physics parameterizations, in particular due to the explicit
coupling between the turbulent vertical diffusion and the surface scheme. Experience shows
that ∆t should not significantly exceed 1000 s, which becomes relevant when ∆x is larger
than about 125 km.
Even longer time steps than ∆t can be used for the so-called slow-physics parameteriza-
tions, i.e. radiation, convection, non-orographic gravity wave drag, and orographic gravity
wave drag. These parameterizations provide tendencies to the dynamical core, allowing
them to be called individually at user-specified time steps. The related namelist switches
are dt_rad, dt_conv, dt_gwd and dt_sso in nwp_phy_nml. If the slow-physics time step
is not a multiple of the advective time step, it is automatically rounded up to the next
advective time step. A further recommendation is that dt_rad should be an integer mul-
tiple of dt_conv, such that radiation and convection are called at the same time. The
time-splitting is schematically depicted in Figure 4.1.
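A sketch of such settings might look as follows (all values are purely illustrative and not a recommendation; note that dt_rad is an integer multiple of dt_conv):

&nwp_phy_nml
 dt_conv = 600.    ! convection called every 10 min
 dt_rad  = 1800.   ! radiation called every 30 min = 3 x dt_conv
 dt_gwd  = 1200.   ! non-orographic gravity wave drag
 dt_sso  = 1200.   ! orographic (SSO) gravity wave drag
/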
The necessary input data to perform a real data run have already been described in
Chapter 2. These include
• grid files, containing the horizontal grid information,
• external parameter files, providing information about the earth’s soil and land prop-
erties, as well as climatologies of atmospheric aerosols,
• initial data (analysis) for atmosphere and land,
• boundary data in the case of limited area runs.
ICON is capable of reading analysis data from various sources (see Section 2.3), including
data sets generated by DWD’s Data Assimilation Coding Environment (DACE) and inter-
polated IFS data. Boundary data for limited area runs can be taken from forecasts/analysis
of the latter two models as well as from COSMO-DE. In the following we provide some
guidance on the basic namelist settings for real data runs, according to the data set at
hand and chosen initialization mode.
Most of the main switches that were used for setting up idealized test cases are also
important for setting up real data runs. Many of them have already been discussed in
Section 3.1, so we will mainly concentrate on their settings for real data runs. As before,
settings appropriate for the exercises in this chapter are given in red.
For real case runs, it is important that the user specifies the correct start date and time
of the simulation. This is done with ini_datetime_string using the ISO 8601 format.
ini_datetime_string = YYYY-MM-DDThh:mm:ssZ (namelist time_nml)
— This must exactly match the validity time of the analysis data set!
Wrong settings lead to incorrect solar zenith angles and wrong external parameter fields.
Setting the end date and time of the simulation via end_datetime_string is optional. If
end_datetime_string is not set, the user has to set the number of time steps explicitly
in nsteps (run_nml), which is otherwise computed automatically.
extpar_filename = "extpar_<gridfile>.nc"
The keyword <gridfile> is automatically replaced by ICON with the grid filename
specified for the given domain (dynamics_grid_filename). As opposed to the grid-
file specification namelist variables (see above), it is not allowed to provide a comma-
separated list. Instead, the usage of keywords provides full flexibility for defining the
filename structure. See also Section 4.2.2 for additional keywords.
As mentioned previously, ICON allows for different real data initialization modes. The
mode in use is controlled via the namelist switch init_mode.
init_mode (namelist initicon_nml, integer value)
It is possible to
• start from DWD analysis without the incremental analysis update (IAU) pro-
cedure (see Section 2.3.1): init_mode = 1
• start from interpolated IFS analysis (see Section 2.3.2): init_mode = 2
• start the atmosphere from interpolated IFS analysis and the soil/surface from interpo-
lated ICON/GME fields: init_mode = 3
• start from DWD analysis, and make use of the IAU procedure to reduce initial
noise (see Section 2.3.1): init_mode = 5
• The initialization modes init_mode = 4 and init_mode = 7 start from inter-
polated COSMO-DE or ICON/IFS forecasts for limited area runs. Limited area
simulations are a feature which will be addressed in Chapter 5.
The most relevant modes are modes 1, 2, 5 and 7. Modes 1, 2 and 5 are relevant for global
ICON simulations with and without nests. They will be explained in more detail below.
ICON supports NetCDF and GRIB2 as input format for the DWD input fields. In this
context it is important to note that the field names that are used in the GRIB2 input
files do not necessarily coincide with the field names that are internally used by the ICON
model. To address this problem, an additional input text file is provided, a so-called
dictionary file. This file translates between the ICON variable names and the corresponding
DWD GRIB2 short names.
Generally the dictionary is provided via the following namelist parameter:
Given that a valid DWD analysis data set for non-incremental update is available (see
Section 2.3.1), starting from DWD analysis data is basically controlled by the following
three namelist parameters:
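The parameter listing itself is not reproduced here. Presumably the three parameters are the initialization mode and the two input file names, which would look schematically like this (the file names follow the default pattern discussed below in Section 4.2.2):

&initicon_nml
 init_mode       = 1                                  ! DWD analysis without IAU
 dwdfg_filename  = "dwdFG_R<nroot>B<jlev>_DOM<idom>.nc"
 dwdana_filename = "dwdana_R<nroot>B<jlev>_DOM<idom>.nc"
/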
In that case the ICON model expects two input files per domain: One containing the
ICON first guess (3h forecast) fields, which served as background fields for the assimilation
process. The second one contains the analysis fields produced by the assimilation process.
Remember to make sure that the validity date of the first guess and analysis input files is
the same and matches the model start date given by ini_datetime_string.
Input filenames need to be specified unambiguously, of course. By default, if the user does
not provide namelist settings for dwdfg_filename and dwdana_filename, the filenames
have the form
dwdfg_filename = "dwdFG_R<nroot>B<jlev>_DOM<idom>.nc"
dwdana_filename = "dwdana_R<nroot>B<jlev>_DOM<idom>.nc"
This means, e.g., that the first guess filename begins with "dwdFG_", supplemented by
the grid resolution R<nroot>B<jlev> and the domain number DOM<idom>. Filenames are treated
case-sensitively.1
By changing the above setting, the user has full flexibility with respect to the
filename structure. The following keywords are allowed:
As described in Section 2.3.1, incremental analysis update is a means to reduce the initial
noise which typically results from small scale non-balanced modes in the analysis data
set. Given that a valid DWD analysis data set for incremental update is available (see
Section 2.3.1), starting from DWD analysis data is basically controlled by the following
namelist parameters:
1
More precisely this behaviour depends on the file system: UNIX-like file systems are case sensitive, but
the HFS+ Mac file system (usually) is not.
ICON again expects two input files: One containing the ICON first guess, which typically
consists of a 1.5 h forecast taken from the assimilation cycle (as opposed to a 3 h forecast
used for the non-IAU case). The second one contains the analysis fields (mostly increments)
produced by the assimilation process.
The behaviour of the IAU procedure is controlled via the namelist switches dt_iau and
dt_shift:
dt_iau = 10800 (namelist initicon_nml, real value)
Time interval (in s) during which the IAU procedure (i.e. dribbling of analysis in-
crements) is performed.
As explained in Section 2.3.1 and depicted in Figure 4.2, you have to make sure that the
first guess is shifted ahead in time by −0.5 · dt_iau w.r.t. the analysis. The model start
time ini_datetime_string must match the validity time of the analysis.
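Putting these pieces together, a typical IAU configuration consistent with Figure 4.2 might look as follows (the dt_shift value is inferred from the −0.5 · dt_iau rule above):

&initicon_nml
 init_mode = 5        ! DWD analysis with incremental analysis update
 dt_iau    = 10800.   ! IAU window of 3 h
 dt_shift  = -5400.   ! first guess / model start shifted 1.5 h ahead of the nominal start date
/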
No filtering procedure is currently available when starting off from interpolated IFS anal-
ysis data. The model just reads in the initial data from a single file and starts the forecast.
Remember to make sure that the model start time given by ini datetime string matches
the validity date of the analysis input file.
Figure 4.2.: Schematic illustrating typical settings for a global ICON forecast run starting
from a DWD analysis with IAU at 00 UTC. IAU is performed over a 3 h time
interval (dt_iau), with the model start being shifted ahead of the nominal
start date by 1.5 h (dt_shift).
Model output is enabled via the namelist run_nml with the main switch output. By setting
this string parameter to the value "nml", the output files and the fields requested for output
can be specified. In the following, this procedure will be described in more detail.
In general the user has to specify five individual quantities to generate output of the model.
These are:
All of these parameters are set in the namelist output_nml. Multiple instances of this
namelist may be specified for a single model run, where each output_nml creates a separate
output file. The options d) and e) require an interpolation step. They will be discussed in
more detail in Section 6.3.
In the following, we give a short explanation for the most important namelist parameters:
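Since the parameter listing itself is not reproduced here, the following fragment merely sketches a typical output namelist. Apart from ml_varlist, pl_varlist, dom and steps_per_file, which appear in the surrounding text, the parameter names and values are assumptions and should be checked against the namelist documentation:

&output_nml
 filetype        = 4                      ! assumed: 4 = NetCDF, 2 = GRIB2
 dom             = 1                      ! write domain 1 only
 output_filename = 'NWP_mlev'             ! prefix of the output file(s); name hypothetical
 steps_per_file  = 10                     ! split the output into files of 10 output steps each
 ml_varlist      = 'pres_sfc', 'temp'     ! model level output; field names illustrative
/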
Users can also specify the variable names in a different naming scheme, for exam-
ple "T" instead of "temp". To this end, a translation table (a two-column ASCII
file) can be provided via the parameter output_nml_dict in the namelist io_nml.
An example for such a dictionary file can be found in the source code directory:
run/dict.output.dwd.
As stated before, each output_nml creates a separate output file. To be more
precise, there are a couple of exceptions to this rule. First, multiple time steps can be
stored in a single output file, but they may also be split up over a sequence of files (with a
corresponding index in the filename), cf. the namelist parameter steps_per_file. Second,
an instance of output_nml may also create more than one output file if grid nests have been
enabled in the model run together with the global model grid, cf. the namelist parameter
dom. In this case, each model domain is written to a separate output file. Finally, model
output is often written on different vertical axes, e.g. on model levels and on pressure
levels. The specification of this output then differs only in the settings for the vertical
interpolation. Therefore it is often convenient to specify the vertical interpolation in the
same output_nml as the model level output, which again leads to multiple output files.
As mentioned in the introduction, ICON can use two different mechanisms for parallel
execution:
a) OpenMP – Multiple threads are run in a single process and share the memory of a
single machine.
An implementation of OpenMP ships with your Fortran compiler. OpenMP-parallel
execution therefore does not require the installation of additional libraries.
b) MPI – Multiple ICON processes (processing elements, PEs) are started simulta-
neously and communicate by passing messages over the network. Each process is
assigned a part of the grid to process.
These mechanisms are not mutually exclusive. A hybrid approach is also possible: Mul-
tiple ICON processes are started, each of which starts multiple threads. The processes
communicate using MPI. The threads communicate using OpenMP.
Worker PEs: the majority of MPI tasks, doing the actual work
I/O PEs: dedicated I/O server tasks
Restart PEs: for asynchronous restart writing (see Section 6.4)
Prefetch PE: for asynchronous read-in of boundary data in limited area mode
(see Section 5.1.5)
Test PE: MPI task for verification of the MPI parallelization (debug option)
The configuration settings are defined in the namelist parallel_nml. To specify the
number of output processes, set the namelist parameter num_io_procs to a value
larger than 0, which reserves a number of processors for output. While writing, the
remaining processors continuously carry out calculations. Conversely, setting this
option to 0 forces the worker PEs to wait until output is finished. For the writing
of the restart checkpoints (see Section 6.4), there exists a corresponding namelist
parameter num_restart_procs.
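Schematically (values illustrative):

&parallel_nml
 num_io_procs      = 2    ! dedicated asynchronous output PEs
 num_restart_procs = 1    ! dedicated PE for asynchronous restart writing
/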
During start-up the model prints out a summary of the processor partitioning. This
is often helpful to identify performance bottlenecks. First of all, the model log output
contains a one-line status message:
Afterwards, the sizes of grid partitions for each MPI process are summarized as
follows:
Number of compute PEs used for this grid: 118
# prognostic cells: max/min/avg xxx xxx xxx
If the partitioning process failed, these (and the subsequently printed) values would
be grossly out of balance.
Increasing the number of nodes makes more computational resources available, since a
single compute node comprises only a limited number of PEs and OpenMP threads.
On the other hand, off-node communication is usually more expensive in terms of
runtime performance.
When using the qsub command to submit a script file, the queuing system PBSPro
allows options to be specified at the beginning of the file, each line prefaced by the
#PBS delimiter followed by PBS commands (see also the comments in Appendix A).
For example, to run the executable in hybrid mode on 12 nodes with 4 OpenMP
threads per process, set
#PBS -q xc_norm_h
#PBS -l select=12:ompthreads=4
#PBS -l place=scatter
#PBS -l walltime=01:00:00
#PBS -j oe
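The matching aprun call inside the script might then look roughly as follows. The option letters are the usual Cray aprun ones; the concrete numbers are illustrative (here 12 nodes with 12 MPI tasks per node and 4 OpenMP threads per task, as in Exercise 4.4) and must be consistent with the PBS resource request. $MODEL is a placeholder for the ICON binary:

aprun -n 144 -N 12 -d 4 -j 2 $MODEL
# -n: total number of MPI tasks (12 nodes x 12 tasks/node)
# -N: MPI tasks per node
# -d: OpenMP threads (depth) per MPI task
# -j: two hardware threads per core (hyperthreading)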
Best practice for parallel setups (note for advanced users): ICON employs both
distributed memory parallelization and shared memory parallelization, i.e. a “hybrid
parallelization”. Only the former type actually performs a decomposition of the
domain data, using the de-facto standard MPI. The shared memory parallelization,
on the other hand, uses OpenMP directives in the source code. In fact, nearly all DO
loops that iterate over grid cells are preceded by OpenMP directives. For reasons of
cache efficiency the DO loops over grid cells, edges, and vertices are organized in two
nested loops: “jb loops” and “jc loops”1 . Here the outer loop (“jb”) is parallelized
with OpenMP.
There is no straightforward way to determine the optimal hybrid setup, except for
the extreme cases: If only a single node is used, then the global memory facilitates
a pure OpenMP parallelization. Usually, this setup is only feasible for very small
simulations. If, on the other hand, each node constitutes a single-core system, a
multi-threaded (OpenMP) run would not make much sense, since multiple threads
would interfere on this single core. A pure MPI setup would be the best choice then.
In all of the other cases, the parallelization setup depends on the hardware platform
and on the simulation size. In practice, 4 threads/MPI task have proven to be a good
choice on Intel-based systems. This should be combined with the hyperthreading
feature, i.e. a feature of the x86 architecture where one physical core behaves like
two virtual cores.
Starting from this number of threads per task the total number of MPI tasks is
then chosen such that each node is used to an equal extent and the desired time-
to-solution is attained – in operational runs at DWD this is ∼ 1 h. In general one
should take care that the number of OpenMP threads evenly divides
the number of cores per CPU socket, otherwise intersocket communication might
impede the performance.
Finally, there is one special case: If an ICON run turns out to consume an extraor-
dinarily large amount of memory (which should not be the case for a model with
a decent memory scaling), then the user can resort to “investing” more OpenMP
threads than is necessary for the runtime performance. Doing so, each MPI pro-
cess has more memory at its disposal.
1
This implementation method is known as loop tiling, see also Section 5.3.
4.4.2. Bit-Reproducibility
Bit-reproducibility refers to the feature that running the same binary multiple times should
ideally result in bitwise identical results. Depending on the compiler and the compiler
flags used this is not always true if the number of MPI tasks and/or OpenMP threads
is changed in between. Usually compilers provide options for creating a binary that of-
fers bit-reproducibility, however this is often payed dearly by strong performance losses.
With the Cray compiler, it is however possible to generate an ICON binary offering bit-
reproducibility with only little performance loss. The ICON binary used in this workshop
gives bit-reproducible results (will be checked in Exercise 4.7).
• for checking the MPI/OpenMP parallelization of the code. If the ICON code does
not give bit-identical results when running the same configuration multiple times,
this is a strong hint of an OpenMP race condition. If the results change only when
changing the processor configuration, this is a hint of an MPI parallelization bug.
• for checking the correctness of new code that is supposed not to change the results.
The ICON code contains internal routines for performance logging for different parts
(setup, physics, dynamics, I/O) of the code. These may help to identify performance
bottlenecks. ICON performance logging provides timers via the two namelist parameters
ltimer and timers_level (namelist run_nml).
With the following settings in the namelist run_nml,
ltimer = .TRUE.
timers_level = 10
the user gets a sufficiently detailed output of wall clock measurements for different parts
of the code:
Note that some of the internal performance timers are nested, e.g. the timer log for
radiation is contained in physics, indicated by the “L” symbol. For correct interpreta-
tion of the timing output and computation of partial sums one has to take this hierarchy
into account.
Note for advanced users: The built-in timer output is rather non-intrusive. It is
therefore advisable to have it enabled also in operational runs.
4.5. Exercises
In this exercise you will learn how to start ICON from DWD analysis data and how to
perform a multi-day forecast with 40 km resolution globally and 20 km resolution over
Europe. Another practical aim of this exercise is to create raw data for driving a limited
area ICON run. Further use of this data will be made in Ex. 5.1.
Job submission to the Cray XC 40 can be performed on the Linux cluster lce. Note,
however, that visualization tools (CDO, NCL, ncview) are only available on
the Linux cluster lce!
Input Data
Note:
The exercises in this section require a number of grid files, external parameters
and input data. This data is already available in the directory case2/input. Their
creation process is explained in Ex. 2.1 and 2.2 (see p. 32).
By default, ICON expects the input data to be located in the experiment directory (termed
$EXPDIR in the run scripts). The run script creates symbolic links in the experiment
directory, which point to the input files. Table 4.1 provides a list of all input files in
the input directory (left column) together with the corresponding symbolic links in the
experiment directory (right column). Note that, in general, the original names differ from
the symbolic link names which need to match the default filename structure expected by
the model.
extpar files
icon_extpar_0024_R02B06_G_20150805_tiles.nc −→ extpar_DOM01.nc
icon_extpar_0028_R02B07_N02_20150805_tiles.nc −→ extpar_DOM02.nc
Table 4.1.: List of all input files required for Exercises 4.1–4.5. The left column shows the
original filenames, as found in the input directory case2/input, whereas the
right column shows the corresponding symbolic link names in the experiment
directory. With the symbolic links we avoid absolute path settings in the ICON
namelists.
In this exercise you will learn how to run ICON in real data mode.
Open the ICON run script case2/run_ICON_R02B06_dwdini and prepare the script
for running a global 72 hour forecast on an R2B06 grid with 40 km horizontal
resolution without nest. The start date is EX 4.1
Basic settings
• The grid file(s) to be used are already specified in the run script (see
dynamics_grid_filename, radiation_grid_filename).
• Due to the appropriate choice of the symbolic links in the experiment directory
(see Table 4.1), ICON is able to locate the first guess and analysis data sets
automatically. However, both are given in the run script for reference, using the keyword
nomenclature mentioned in Section 4.2.2.
Have a look at these settings and try to understand how these keywords work.
• Choose the appropriate initialization mode init_mode for starting the model
from DWD analysis with IAU (see Section 4.2.3).
– Take a look at the output files in case2/output. You should find two files
named NWP ... Use cdo sinfov to identify the
time interval between two outputs – Answer: h
total time interval for which output is written – Answer: h
type of vertical output grid (ML, PL, HL) – Answer:
– one file contains output on the native ICON grid, the other one output on
a regular lat/lon grid. Use cdo sinfov to identify which file is which.
– Answer: native
– Answer: lat/lon
Timestepping
This exercise focuses on aspects of the ICON time-stepping scheme, explained in Sec-
tion 4.1.
• Compute the dynamics time step ∆τ from the specification of the physics time
step ∆t (dtime) and the number of dynamics substeps ndyn_substeps.
EX 4.2
– Answer: ∆τ = s
• Take a look at Equation (4.1) and calculate an estimate for the maximum
dynamic time step which is allowed for the horizontal resolution at hand.
– Answer: ∆τmax = s
Now compare this to the time step used: Did we make a reasonable choice?
In this exercise you will learn how to specify the details of the parallel execution. We will
use the ICON timer module for basic performance measuring.
• Open your run script from the previous Exercise 4.1 and enable the ICON
EX 4.3
routines for performance logging (timers). To do so, follow the instructions in
Section 4.4.3.
• At the end of the model run, a log file is created. It can be found in your base
directory case2. Scroll to the end of this file. You should find wall clock timer
output comparable to that listed in Section 4.4.3. Try to identify
the total run time – Answer: s
the time needed by the radiation module – Answer: s
Changing the number of MPI tasks: In case your computational resources are
sufficient, one possibility to speed up your model run is to increase the number of
MPI tasks. EX 4.4
• Create a copy of your run script run_ICON_R02B06_dwdini named
run_ICON_R02B06_dwdini_fast. In order to avoid overwriting your old results,
replace the output directory name (EXPDIR) by exp02_dwdini_fast.
• Double the total number of MPI tasks compared to your previous job and
re-submit. In more detail, do the following:
– Your old script (run_ICON_R02B06_dwdini) ran the executable in hybrid
mode on 25 nodes using 12 MPI tasks/node and 4 OpenMP threads/MPI
task with hyperthreading enabled.
– Your new script (run_ICON_R02B06_dwdini_fast) should run the
executable in hybrid mode on 50 nodes using 12 MPI tasks/node and 4
OpenMP threads/MPI task with hyperthreading enabled.
You need to adjust both the PBS settings and the aprun command. See
Section 4.4.3 for additional help.
• Compute the speedup that you gained from doubling the number of MPI tasks.
– Compare the timer output of the dynamical core, nh_solve, and the
transport module, transport, with the timer output of your previous run.
What do you think is a more sensible measure of the effective cost: total
min rank or total max rank?
– Answer:
– Which speedup did you achieve and what would you expect from
“theory”?
– Answer: Speedup achieved = T_25nodes / T_50nodes =
– Answer: Speedup expected =
In this exercise you will learn about some of the output capabilities of ICON. We will set
up a new output namelist and generate a data set which enables us to drive a limited area
version of ICON in Chapter 5.
Figure 4.3.: Computational grid with a two-way nested region over Europe (yellow shad-
ing). The outline of the COSMO-EU domain (formerly used operationally by
DWD) is shown in red for comparison.
• In order to save some computational resources, the nested region should have a
reduced model top height and comprise only the lowermost 60 vertical levels of
the global domain (instead of 90 levels). Please extend num_lev (run_nml)
accordingly.
• The run script contains two commented-out output namelists. Activate the
namelists and fill in the missing parameters. See Section 4.3 for additional
details regarding output namelists. Output should be written
– in GRIB2 format
• You should find 25 files in your output directory "lam_forcing". Apply the
command cdo sinfov data-file.grb > data-file.sinfov to the last file. Compare
with the reference output in Table 4.2 to see whether your output namelist is
correct.
Table 4.2.: Reference output for Ex. 4.5. Structure and content of the file
forcing_DOM02_20170114T000000Z.grb
Depending on the data set used to initialize the model, spin-up effects may become visible
during the first few hours of a forecast. This is especially true if third-party (i.e. non-
native) analysis data sets are used. Here we will have a look at the spin-up behaviour
when ICON is started from native vs. non-native analysis. As an example of a popular
non-native analysis, we will choose data from the IFS.
• Run the NCL script case2/water_budget.ncl to get a deeper insight into the
model's spin-up properties and water budget when started from DWD analysis.
The script generates time series of vertically integrated water vapour tqv and EX 4.6
condensate tqx from the model level output in case2/output/exp02_dwdini.
Compare the results to Figure 4.4 which shows the corresponding results when
ICON is started from IFS analysis.
The additional NCL plots will give you some insight into the global
atmospheric water budget
    dQ_t/dt = P − E + R ,
where Q_t is the vertically integrated atmospheric water content and dQ_t/dt is
the rate of change over time. P is the amount of total precipitation, E is the
total evaporation, and R is a residuum. Is the budget closed (i.e. is R = 0 in
your model run)?
• Compare the model level output of run_ICON_R02B06_ifsini with the output
that was produced by your modified script run_ICON_R02B06_ifsini_fast.
You can use cdo infov data-file.nc for getting information about the contents EX 4.7
of your output file. You should dump this information into text files so that
you can compare them later on. Due to the limited number of digits printed by
CDO, this is not a check for bit-reproducibility in a strict mathematical sense;
however, experience has shown that cdo infov is very reliable in revealing
reproducibility issues.
Hint: For a convenient comparison of ASCII files you may use the tkdiff
utility.
Figure 4.4.: Time series of area averaged column integrated specific moisture ⟨tqv⟩ (top)
and condensate classes ⟨tqx⟩ (bottom) for a 7-day forecast started from IFS
analysis fields. Start date was 2017-01-12T00:00:00. A spin-up in ⟨tqv⟩ and
initial adjustments in ⟨tqx⟩ are clearly visible.
The most important first: Running the limited area (regional) mode of ICON does not
require a separate, fundamentally different executable. Instead, ICON-LAM is quite similar
to the other model components discussed so far: It is easily enabled by a top-level namelist
switch
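Schematically, and assuming (as in recent ICON versions) that the switch belongs to grid_nml:

&grid_nml
 l_limited_area = .TRUE.   ! run ICON as a limited area model
/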
Other namelist settings must be added, of course, to make a proper ICON-LAM setup.
This chapter explains some of the details.
Chapter Layout. Some of the preprocessing aspects regarding the regional mode have
already been discussed in Section 2.3.4. Based on these prerequisites the exercises in this
chapter will explain how to actually set up and run limited area simulations.
Apart from these technical adjustments, a more detailed understanding of ICON’s physics-
dynamics coupling is a necessary starting point for actually modifying and extending the
ICON model. We will provide information on this more general subject in the following
sections as well. Finally, the reader will be able to implement own model diagnostics.
This section provides technical details on the limited area mode, in particular on how to
control the read-in of boundary data.
In Section 3.3.2 (see p. 43) the nesting capability of ICON has been explained. Technically,
the same computational grids may be used either for the limited area mode or the nested
mode of ICON1 . Furthermore, both ICON modes aim at simulations with finer grid spacing
and smaller scales. They therefore choose a comparable set of options out of the portfolio
of available physical parameterizations.
However, there exist some differences between the regional and the one-way nested mode:
1
Here, we do not take the reduced radiation grid feature into account, see Section 6.2. This serves to
simplify the discussion at this point.
Within the boundary zone, the driving boundary data is partly prescribed and partly
combined with the prognostic fields of the regional domain by taking a weighted mean of
the two.
[Schematic: an interpolation zone of width grf_bdywidth = 4 cell rows at the domain
boundary is followed by a nudging zone of width nudge_zone_width = 8 cell rows towards
the domain interior.]
On the outermost four cell rows (grf_bdywidth) the boundary data are simply interpo-
lated onto the domain. In the adjacent nudging zone the prognostic fields are nudged
towards the driving boundary data. The nudging weights are reduced with increasing dis-
tance from the boundary. The nudge zone width in terms of cell rows can be specified in
nudge_zone_width. It should comprise at least 8 (better 10) cell rows in order to minimize
boundary artefacts.
Limited area runs with ICON require new initialization modes, in addition to those de-
fined in Section 4.2.1. In these modes the read-in process will be followed by a vertical
interpolation of the input fields.
When we do not make use of additional analysis information, we need to set this via a
namelist option:
lread_ana (namelist initicon_nml, logical value)
By default, this namelist parameter is set to .TRUE.. If set to .FALSE., ICON is started
from the first guess only and an analysis file is not required. The filename of the first
guess file is specified via the dwdfg_filename namelist option, see Section 4.2.2.
Boundary data is read in regular time intervals. This is specified by the following namelist
parameter:
dtime_latbc (namelist limarea_nml, floating-point value)
Time difference in seconds between two consecutive boundary data sets.
Naturally, the sequence of lateral boundary data files must satisfy a consistent naming
scheme. It is a good idea to consider this convention already during the preprocessing
steps (see Section 2.3.4).
Field names: latbc_varnames_map_file (namelist limarea_nml, string)
ICON supports NetCDF and GRIB2 as input formats for the boundary data. Since
the field names used, e.g., in the GRIB2 input files often do not coincide with
the field names that are internally used by the ICON model, an additional input
text file (dictionary file) can be provided. This two-column file translates between
the ICON variable names and the corresponding DWD GRIB2 short names.
If no latbc_varnames_map_file is specified, then it is assumed that all required
fields can be identified in the input files by their ICON-internal names.
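Collecting the switches discussed in this section, a minimal limarea_nml fragment might look as follows; the file names are hypothetical, and the path and naming pattern of the boundary data files are set via further limarea_nml parameters not shown here:

&limarea_nml
 dtime_latbc             = 10800.                      ! boundary data every 3 h
 latbc_boundary_grid     = 'lateral_boundary.grid.nc'  ! boundary grid file (name hypothetical)
 latbc_varnames_map_file = 'dict.latbc'                ! optional GRIB2 dictionary (name hypothetical)
/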
5.2.1. Overview
For efficiency reasons, a distinction is made between so-called fast-physics processes (those
whose time scale is comparable or shorter than the model time step), and slow-physics
processes whose time scale is considered slow compared to the model time step. The
relationship between the different time steps has already been explained in Section 4.1.
Fast-physics processes are calculated at every physics time step and are treated with time
splitting (also known as sequential split) which means that (with exceptions noted be-
low) they act on an atmospheric state that has already been updated by the dynamical
core, horizontal diffusion and the tracer transport scheme. Each process then sequentially
updates the atmospheric variables and passes a new state to the subsequent parameteri-
zation.
Table 5.1.: Summary of ICON's physics parameterizations, together with the related
namelist settings (namelist nwp_phy_nml). Note: Since the JSBACH compo-
nent is not available in NWP mode, it has been excluded from this list.
Figure 5.1.: Coupling of the dynamical core and the NWP physics package. Processes
declared as fast (slow) are treated in a time-split (process-split) manner.
time level because the surface variables are not updated in the dynamical core and the
surface transfer coefficients and fluxes would be calculated from inconsistent time levels
otherwise. The coupling strategy is schematically depicted in Figure 5.1.
Slow-physics processes are treated in a parallel-split manner, which means that they are
stepped forward in time independently of each other, starting from the model state provided
by the latest fast-physics process. In ICON, convection, radiation, non-orographic and
orographic gravity wave drag are considered as slow processes. Typically, these processes
are integrated with time steps longer than the (fast) physics time step. The slow-physics
time steps can be specified by the user. The resulting slow-physics tendencies ∂vn /∂t,
∂T /∂t and ∂qx /∂t with x ∈ [v, c, i] are passed to the dynamical core and remain constant
between two successive calls of the parameterization (Figure 5.1). Since ICON solves a
prognostic equation for π rather than T , the temperature tendencies are converted into
tendencies of the Exner function beforehand.
The physics-dynamics coupling in ICON differs from many existing atmospheric models
in that it is performed at constant density (volume) rather than constant pressure. This is
related to the fact that the total air density ρ is one of the prognostic variables, whereas
pressure is only diagnosed for parameterizations needing pressure as input variable. Thus,
it is natural to keep ρ constant in the physics-dynamics interface. As a consequence,
heating rates arising from latent heat release or radiative flux divergences have to be con-
verted into temperature changes using cv , the specific heat capacity at constant volume of
moist air. Some physics parameterizations inherited from hydrostatic models, in which the
physics-dynamics coupling always assumes constant pressure, therefore had to be adapted
appropriately.
Moreover, it is important to note that the diagnosed pressure entering into a variety of
parameterizations is a hydrostatically integrated pressure rather than a nonhydrostatic
pressure derived directly from the prognostic model variables2 . This is motivated by the
fact that the pressure is generally used in physics schemes to calculate the air mass repre-
sented by a model layer, and necessitated by the fact that sound waves generated by the
saturation adjustment can lead to a local pressure increase with height in very extreme
cases, particularly between the lowest and the second lowest model level.
Another important aspect is related to the fact that physics parameterizations traditionally
work on mass points (except for three-dimensional turbulence schemes). While the conver-
sion between different sets of thermodynamic variables is reversible except for numerical
truncation errors, the interpolation between velocity points and mass points potentially
induces errors. To minimize them, the velocity increments, rather than the full velocities,
coming from the turbulence scheme are interpolated back to the velocity points and then
added to the prognostic variable vn .
In the exercises at the end of this chapter, we will investigate ICON’s physical parame-
terizations by means of a custom diagnostic quantity. We restrict ourselves to the cloud
microphysics parameterization, where some additional background information will be of
interest:
Microphysical schemes provide a closed set of equations to calculate the formation and
evolution of condensed water in the atmosphere. The most simple schemes predict only the
specific mass content of certain hydrometeor categories like cloud water, rain water, cloud
ice and snow. This is often adequate, because it is sufficient to describe the hydrological
cycle and the surface rain rate, which is the vertical flux of the mass content. Microphysical
schemes of this category are called single-moment schemes.
2
Note that the (surface) pressure available for output is as well the hydrostatically integrated pressure
rather than a nonhydrostatic pressure derived directly from the prognostic model variables.
In ICON, two single-moment schemes are available: one that predicts the categories cloud
water, rain water, cloud ice and snow (inwp_gscp=1 in the namelist nwp_phy_nml), and one
that in addition predicts a graupel category (inwp_gscp=2). Graupel forms through
the collision of ice or snow particles with supercooled liquid drops, a process called riming.
Most microphysical processes depend strongly on particle size, and although the mean size
is usually correlated with the mass content, this is not always the case. Schemes that also
predict the number concentrations have the advantage that they provide size information
which is independent of the mass content. Such schemes are called double-moment
schemes, because both mass content and number concentration are statistical moments
of the particle size distribution.
ICON also provides a double-moment microphysics scheme (inwp_gscp=4), which
predicts the specific mass and number concentrations of cloud water, rain water, cloud
ice, snow, graupel and hail. This scheme is most suitable at convection-permitting or
convection-resolving scales, i.e., mesh sizes of 3 km and finer. Only on such fine meshes
is the dynamics able to resolve the convective updrafts in which graupel and hail form.
On coarser grids the use of the double-moment scheme is not recommended.
To predict the evolution of the number concentrations the double-moment scheme includes
various parameterizations of nucleation processes and all relevant microphysical interac-
tions between these hydrometeor categories. Currently all choices regarding, e.g., cloud
condensation and ice nuclei, particle geometries, fall speeds etc. have to be made in the
code itself and cannot be chosen via the ICON namelist.
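The scheme itself is selected in the physics namelist, e.g.:

&nwp_phy_nml
 inwp_gscp = 4   ! 1: cloud water/rain/ice/snow, 2: additionally graupel, 4: double-moment scheme
/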
A thorough description of how to modify the ICON model and implement one’s own
diagnostics would certainly be a chapter in its own right. Moreover, its scope would not be
limited to LAM applications. Here, we try to keep things as simple and short as possible
with a view to the subsequent exercises.
Adding new fields. ICON keeps so-called variable lists of its prognostic and diagnostic
fields. This global registry eases the task of memory (de-)allocation and organizes the
field's meta-data, e.g., its dimensions, description and unit. The basic call for registering
a new variable is the add_var command (module mo_var_list). Its list of arguments is
rather lengthy and we will discuss them step by step.
First, we need an appropriate variable list to which we can append our new variable. For
the sake of simplicity, we choose an existing diagnostic variable list, defined in the module
mo_nonhydro_state:
The corresponding type definition can be found in the module mo_nonhydro_types. There,
in the derived data type TYPE(t_nh_diag), we place a 2D variable pointer
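A rough sketch of the two code fragments referred to above (the exact declarations differ between ICON versions; the list name p_diag_list is a placeholder and the remaining add_var arguments are omitted):

! in TYPE(t_nh_diag), module mo_nonhydro_types:
REAL(wp), POINTER :: newfield(:,:)        ! new 2D diagnostic, dimensions (nproma, nblks_c)

! in mo_nonhydro_state, where the diagnostic variable list is set up:
CALL add_var( p_diag_list, 'newfield', p_diag%newfield, ... )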
From now on the new field can be specified in the output namelists that were described
in Section 4.3:
&output_nml
...
ml_varlist = ’newfield’
/
Looping over the grid points. Of course, the newly created field ’newfield’ still needs
to be filled with values and the dimensions of the 2D field have not yet been explained.
For reasons of cache efficiency, nearly all DO loops that iterate over grid cells are organized
in two nested loops: "jb loops" and "jc loops". Here the outer loop ("jb") is parallelized
with OpenMP and limited by the cell block number nblks_c. The innermost loop iterates
between 1 and nproma.
Note: Three-dimensional fields have an additional dimension for the column levels:
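Schematically, with nlev denoting the number of vertical levels (the declaration style is illustrative):

REAL(wp), POINTER :: newfield3d(:,:,:)    ! dimensions (nproma, nlev, nblks_c)
! inside the block loop, an additional level loop is added:
DO jk = 1, nlev
  DO jc = is, ie
    newfield3d(jc,jk,jb) = ...
  END DO
END DO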
Since the ICON model is usually executed in parallel, we have to keep in mind that each
process can perform calculations only on a portion of the decomposed domain. Moreover,
some of the cells between interfacing processes are duplicates of cells from neighbouring
sub-domains (so-called halo cells). Often it is not necessary to loop over these points twice.
An auxiliary function get indices c helps to adjust the loop iteration accordingly:
i_startblk = p_patch(domain)%cells%start_block(grf_bdywidth_c+1)
i_endblk = p_patch(domain)%cells%end_block(min_rlcell_int)
DO jb = i_startblk, i_endblk
CALL get_indices_c(p_patch(domain), jb, i_startblk, i_endblk, is, ie, &
grf_bdywidth_c+1, min_rlcell_int)
DO jc = is, ie
p_nh_state(domain)%diag%newfield(jc,jb) = ...
END DO
END DO
The constants grf bdywidth c and min rlcell int can be found in the modules
mo impl constants grf and mo impl constants, respectively.
Placing the subroutine call. Having encapsulated the computation together with the DO
loops in a custom subroutine, we are ready to place this subroutine call within ICON's
physics-dynamics cycle.
Let us once more take a look at Figure 5.1: The outer loop “Dynamics → Physics →
Output” is contained in the core module mo nh stepping inside the TIME LOOP iteration.
Then, for diagnostic calculations it is important to have all necessary quantities available
for input. On the other hand the result must be ready before the call to the output module,
CALL write_name_list_output(jstep)
The fail-safe solution is therefore to place the call to your diagnostic routine immediately
before this output call.
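A minimal sketch of this placement (the subroutine name calculate diagnostics matches the
one used in Exercise 5.3 below; its argument list is only an example):
! sketch: inside the time loop of mo_nh_stepping, just before the output call
CALL calculate_diagnostics( p_patch(1), p_nh_state(1)%diag )    ! fills diag%newfield
CALL write_name_list_output( jstep )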
Having inserted the call to the diagnostic field computation, we are done with the final
step. Recompile the model code and you are finished!
Make sure that there is no duplicate functionality and try to improve the read-
ability of your subroutines through indentation, comments etc. This will make
it easier for other developers to understand and adopt your code. It is better to
introduce your own modules with explicit interfaces than to rely on far-reaching
USE statements and PUBLIC fields.
5.4. Exercises
The exercises in this section require a number of grid files, external parameters and in-
put data. In particular, initial data and driving boundary data are required. Both were
produced in the real data exercise, Ex. 4.5. The preprocessing of this data is explained in
Ex. 2.4 (see p. 33).
Figure 5.2.: Illustration of the local grid used in Exercises 5.1–5.3. The horizontal resolu-
tion is ≈ 6.5 km (which corresponds to R3B8 in ICON nomenclature). The
boundary region is highlighted, where nudging towards the driving data is
performed. The driving data for this test case has been created in Ex. 2.4.
EX 5.1
In this exercise we will run ICON in limited area mode. The model will be driven by initial
and boundary data which have been produced in Ex. 2.4.
Open the run script case3/run ICON R3B08 lam and prepare it for running a
48 hour forecast on a limited area grid over Germany (see Figure 5.2). The
start date is the same as for the global run.
The run script is prepared insofar as all softlinks which specify the model binary,
grids, initial and boundary data are already set. Your main task will be to set up
the ICON namelists for a limited area run.
Basic settings:
• Set the correct start and end date:
ini datetime string, end datetime string.
• Switch on the limited area mode by setting l limited area and init mode
(see Section 5.1.3).
Initial data:
• Specify the initial data file via dwdfg filename. Since we do not make use of
additional analysis information from a second data file, remember to set
lread ana accordingly (see Section 5.1.3).
Boundary data:
• Specify the lateral boundary grid and data via latbc boundary grid,
latbc path and latbc filename. For the latter you will have to make use of
the keywords <y>, <m>, <d> and <h> (see Section 5.1.4).
• What is the time interval between two consecutive boundary data files?
The command cdo sinfov may be helpful.
– Answer: s
Set the namelist parameter dtime latbc accordingly.
• For the boundary data, set the number of vertical levels in nlev latbc. Is it
different from the number of vertical levels that is used by the model itself?
– Answer:
Running the model and inspecting the output
• Extend the lat-lon output namelist by the 2 m temperature, surface pressure,
mean sea level pressure, and 10 m gusts. See Appendix C for variable names
and description.
• Submit the job to the Cray XC 40.
• After the job has finished, inspect the model output. You should find multiple
files in the output directory case3/output/exp03 R3B08 dwdlam which
contain hourly output on model levels remapped to a regular lat-lon grid.
– Take a look at the output fields by using ncview. How would you
characterize the overall weather situation for that time period?
southfoehn over the alpine region
strong frontal system passing over Germany
weak frontal system passing over Germany
anticyclonic situation with very low winds
By playing around with the temporal resolution of the boundary data you will get some
idea of how this might affect your simulation results.
EX 5.2
Create a copy of your run script run ICON R3B08 lam and name it
run ICON R3B08 lam lowres. Replace the output directory name (EXPDIR) by
exp03 R3B08 dwdlam lowres in order to avoid overwriting your results.
• Halve the time resolution of your forcing (boundary) data, see Section 5.1.3 for
the namelist parameter. Write down your chosen value:
– Answer:
• Compare the results with your previous run. Does the time frequency with
which the boundary data are updated have a significant impact on the results?
You can visualize cross sections with the NCL script
case3/plot cross section.ncl and/or make use of ncview.
EX 5.3
In this exercise we will implement two new diagnostic fields (2D variables). This requires
modifications of the ICON code; details are given in Section 5.3.
Step-by-step checklist:
• Create an (empty) subroutine calculate diagnostics and place a call to this subroutine
immediately before the call to the output routine.
• Fill your subroutine with a 2D loop over all grid points and calculate the two new
quantities.
Hints:
• Open the namelist of the previous test case 5.1 and insert the new fields in the
output specification.
This chapter provides details on some advanced technical aspects of the ICON model.
The following topics are covered: First, we briefly describe the settings for a reduced moist
physics computation and explain an option for coarse-resolution radiation grids. Then we
discuss the possibilities to write model output on regular grids and different sets of vertical
levels. Finally, before concluding this chapter with some exercises, we give a short overview
of the “defensive I/O”, that is the write-out and read-in of the model state in order to
resume the model run, say, after a previous crash.
Note that substepping is only performed for a particular tracer if a suitable horizontal
transport scheme is chosen. The horizontal transport scheme can be selected individually
for each tracer via the namelist switch ihadv tracer (transport nml). Variants of the
transport scheme with internal substepping are indicated by a two-digit number (i.e. 22,
32, 42, 52).
If moist physics are switched off above 22.5 km (the default for NWP applications), internal
substepping only needs to be applied to specific humidity qv, since the advection of all
other moisture fields is switched off anyway (see Figure 6.1). However, be aware that you
must explicitly enable internal substepping if moist physics are not switched off, or if other
(non-microphysical) tracers are added to the simulation (see, e.g., Chapter 8).
Figure 6.1.: Moist physics are switched off above htop moist proc, while tracer substepping
is switched on above hbot qvsubstep. (Remark: hbot qvsubstep is allowed to be lower
than htop moist proc.)
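As an illustration, a hedged sketch of the corresponding settings is given below. The tracer
ordering and all values are placeholders, and it is assumed here that htop moist proc and
hbot qvsubstep are set in the namelist nonhydrostatic nml:
&transport_nml
 ihadv_tracer = 52, 2, 2, 2, 2      ! substepping variant (52) for qv, standard scheme otherwise
/
&nonhydrostatic_nml
 htop_moist_proc = 22500.           ! switch off moist physics above 22.5 km
 hbot_qvsubstep  = 22500.           ! enable internal substepping for qv above this height
/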
In real case simulations, radiation is one of the most time consuming physical processes.
It is therefore very desirable to reduce the computational burden without degrading the
results significantly. One possibility is to use a coarser grid for radiation than for dynamics.
Step 1. Radiative transfer computations are usually performed every 30 minutes. Before
doing so, all input fields required by the radiation scheme are upscaled to the next
coarser grid level.
Step 2. Then the radiative transfer computations are performed and the resulting short-
wave transmissivities τSW and longwave fluxes FLW are scaled down to the full grid.
Step 3. In a last step we apply empirical corrections to those fields in order to incorporate
the high resolution information about albedo α and surface temperature Tsfc again.
This is especially important at land-water boundaries and the snow line, since here
the gradients in albedo and surface temperature are potentially large.
The reduced radiation grid is controlled via the namelist switches lredgrid phys and
radiation grid filename (see also Exercise 6.4).
Figure 6.2.: Schematic showing how radiation is computed on a reduced (coarser) grid.
Note that running radiation on a reduced grid is the standard setting for operational
runs at DWD. Using the reduced radiation grid is also possible for the limited area mode
ICON-LAM. In this case, both the computational grid and the reduced radiation grid are
regional grids. Make sure to create the latter during the grid generation process by setting
dom(:)%lwrite parent = .TRUE., see Section 2.1.2.
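A hedged sketch of these switches (assuming they belong to the namelist grid nml; the grid
file name is a placeholder) could look like this:
&grid_nml
 lredgrid_phys           = .TRUE.               ! compute radiation on the reduced (parent) grid
 radiation_grid_filename = 'iconR2B05-grid.nc'  ! placeholder: file name of the reduced grid
/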
Many diagnostic tools, e. g. to create contour maps and surface plots, require a regularly
spaced distribution of the data points. Therefore, the ICON model has a built-in out-
put module for the interpolation of model data from the triangular mesh onto a regular
longitude-latitude grid. Furthermore, the model output can be written on a different verti-
cal axis, e. g. on pressure levels, height levels or isentropes. In the following we will describe
how to specify these options.
All of these parameters are set in the namelist output nml. As it was already mentioned
in Section 4.3, multiple instances of this namelist may be specified for a single model run,
where each output nml creates a separate output file.
The relevant namelist parameters for the selection of the vertical output axis are also set in
output nml: ml varlist selects output on the native model levels, hl varlist specifies variables
on height levels, pl varlist defines variables on pressure levels and il varlist specifies output
on isentropic levels.
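To illustrate, a hedged sketch of an output nml block that remaps the output onto a regular
grid and writes pressure-level fields is given below; the parameter names remap, reg lon def,
reg lat def and p levels are given to the best of our knowledge and all values are placeholders:
&output_nml
 ...
 remap       = 1                      ! 1: interpolate onto a regular lon-lat grid
 reg_lon_def = 0., 0.5, 359.5         ! placeholder: start, increment, end (degrees)
 reg_lat_def = -90., 0.5, 90.
 pl_varlist  = 'temp', 'u', 'v'       ! variables interpolated to pressure levels (placeholders)
 p_levels    = 85000, 50000, 25000    ! pressure levels in Pa (placeholders)
/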
There are many reasons why a simulation execution may be interrupted prematurely or
unexpectedly. The checkpoint/restart option can save you from having to start the ICON
model over from the beginning if it does not finish as expected. It allows you to restart
the execution from a pre-defined point using the data stored in a checkpoint file.
The checkpoint/restart functionality is controlled by the namelist parameters
dt checkpoint (the time interval between two checkpoint files) and lrestart (which resumes
a run from the most recent restart file); both are used in the exercises below.
Similar to the asynchronous output module (cf. Section 4.4), the ICON model also offers
the option to reserve a dedicated MPI task for writing checkpoint files. This feature can
be enabled by setting the parameter num restart procs in the namelist parallel nml to
an integer value larger than 0.
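A hedged sketch of these settings (it is assumed here that dt checkpoint is located in io nml
and lrestart in master nml; all values are placeholders):
&io_nml
 dt_checkpoint = 129600.     ! write a checkpoint (restart) file every 36 h
/
&master_nml
 lrestart = .TRUE.           ! resume the run from the most recent restart file
/
&parallel_nml
 num_restart_procs = 1       ! dedicated MPI task for writing checkpoint files
/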
6.5. Exercises
The necessary grids, external parameters and IFS input data on a regular grid are already
available in the directory case4/input. The missing interpolation step, which maps
the IFS data onto the triangular ICON grid is explained in Ex. 2.3.
Job submission to the Cray XC 40 can be performed on the Linux cluster lce. Note,
however, that visualization tools (CDO, NCL, ncview) are only available on
the lce!
EX 6.1
In this lesson you will learn how to start a real-case simulation from IFS data.
Open the ICON run script case4/run ICON R02B06 ifsini and prepare the script
for running a global 48 hour forecast with 40 km horizontal resolution without nest.
The start date is
• Fill in the name of the ICON model binary (including the path) and several
missing namelist parameters. I. e. set ini datetime string,
end datetime string, ltestcase, ldynamics, ltransport, iforcing,
init mode, itopo as well as the filenames for the initial data and external
parameters (ifs2icon filename, extpar filename). See Section 4.2.1 and
4.2.4 for specific settings.
– Take a look at the output files in case4/output. You should find two files
named NWP ... Both should contain 6-hourly model level output up to
48 hours. However, one file contains output on the native ICON grid, the
other one output on a regular lat/lon grid. Use cdo sinfov to identify
which file is which.
EX 6.2
In this lesson you will learn how to adjust the model output according to your needs.
• Create a copy of your run script run ICON R02B06 ifsini from Ex. 6.1 and
name it run ICON R02B06 ifsini output. In order to avoid overwriting your
old results, replace the output directory name (EXPDIR) by
exp02 ifsini output.
• Adjust the output namelist of the run script such that output is written
– in NetCDF format
– containing model level output for qv, qc, qi, qr, qs, 2m temperature, total
precipitation, and mean sea level pressure. See Appendix C for variable
names and description.
• Add a second output namelist which is identical to the previous one, except for
the following changes:
– 6-hourly output should become active after 24 hours until the end of the
model run.
– Output of qv, qc, qi, qr, qs should be on height levels instead of model
levels. In addition, write out total precipitation, 2 m temperature, and
mean sea level pressure, again.
• Let the model write a restart file every 36 h. For this you have to adapt
dt checkpoint. See Section 6.4 for additional details.
– You should find three output files in your output directory. Apply the
command cdo sinfov data-file.nc > data-file.sinfov to each of your
output files. Compare with the reference output in Tables 6.1–6.3, to see
whether your output namelists are correct.
• Visualization:
– Apply the NCL script case4/zonal mean.ncl to your lat-lon output field.
The script plots vertical cross sections of zonally averaged qv as well as
qc + qi + qr + qs after 48 h and contour plots of accumulated precipitation,
sea-level pressure and 2 m temperature. Compare it to Figure 6.3.
Table 6.1.: Structure and content of file NWP DWD DOM01 ML 0001.nc
Table 6.2.: Structure and content of file NWP DWD lonlat DOM01 ML 0001.nc
Restarting a Simulation
Exercise 6.2 did not contain the sea-ice height h ice. Fortunately, the exercise has
produced a checkpoint file from which we can restart the simulation.
EX 6.3
• Restart the ICON model by setting the namelist parameter lrestart as
explained in Section 6.4 and do the following changes:
– For the resumed run, specify the sea-ice height h ice as an additional
output variable on the native ICON grid.
• After the restarted run has completed, use the cdo infov command (see
Section 7.1.2 for details) to get minimum, maximum and average value of the
sea-ice height h ice after 48 h in the output data.
MAX MIN AVG
H ICE:
m m m
• Execute the NCL script case4/sea ice plot.ncl to create a polar plot.
The following fact should become visible from the polar plot: In contrast to
the DWD analysis, the IFS analysis used in this chapter does not offer the
sea-ice height as an initialization field. Therefore, when starting from IFS
analysis, the sea ice height is always set to a constant value of 1 m. Figure 6.4
shows the sea ice height after 48 h for both the IFS- and DWD-based
initialization for reference.
Figure 6.4.: Sea-ice height on the northern hemisphere after 48h (simulation time 2017-01-
14T00:00:00), Left: Simulation started from IFS analysis as in Exercise 6.3;
Right: Simulation started from DWD analysis. Sea-ice height is not offered by
the IFS analysis. Thus, on the left, the sea-ice was initialized with a constant
height of 1 m.
In this exercise you will learn how to control the reduced (coarser) grid for radiation.
• Create another copy of your run script run ICON R02B06 ifsini from
Ex. 6.1 and name it run ICON R02B06 ifsini rg.
EX 6.4
– Change the name of the output directory (EXPDIR) to exp02 ifsini rg.
This avoids overwriting your old results.
– Switch off the reduced radiation grid. This is controlled by the namelist
switches lredgrid phys and radiation grid filename. See Section 6.2
for help.
• After the job has finished, compare the timers for radiation of your previous
run (run ICON R02B06 ifsini) and the new run
(run ICON R02B06 ifsini rg). What is the speedup of the radiation module?
Which speedup would you expect from “theory”?
– Answer: Speedup achieved = Tnonrg / Trg =
– Answer: Speedup expected =
• Comparing the quality of results with and without reduced radiation grid is
beyond the scope of this tutorial. To give you a rough impression that running
forecasts with a reduced radiation grid is a valid choice, we have added
Figure 6.5. It shows verification results (BIAS and RMSE) for 850 hPa
temperature on the northern hemisphere in January with and without reduced
radiation grid.
Table 6.3.: Structure and content of file NWP DWD lonlat DOM01 HL 0001.nc
Figure 6.5.: Verification results (BIAS and RMSE) for 850 hPa temperature on the north-
ern hemisphere for January 2012 for the ICON model and DWD’s former
global model GME. Upper panel: ICON with full radiation grid. Lower panel:
ICON with reduced radiation grid. ICON (R2B6L90) is shown in red, while
the GME with 40 km horizontal resolution is shown in blue for reference. In
terms of RMSE, ICON results with and without reduced radiation grid are
barely indistinguishable.
ICON offers the possibility to produce output either in NetCDF or GRIB2 format. Many
visualization tools such as GrADS or Matlab now include packages with which NetCDF
data files can be handled. The GRIB format, which is also commonly used in meteorology,
can be processed with these tools as well. However, since the standardization of unstruc-
tured GRIB records is relatively new, many post-processing packages offer only limited
support for GRIB data that has been stored on the triangular ICON grid.
For the visualization of regular grid data we will restrict ourselves in this course to a very
simple program, ncview, which does not offer extensive functionality but is a very easy-to-
use program for a quick look at NetCDF output files and is therefore very useful for a first
impression.
Model data that has been stored on the triangular ICON grid can be visualized with the
NCL scripting language or the Generic Mapping Tools (GMT). Section 7.3 contains some
examples how to visualize NetCDF data sets without the need of an additional regridding.
For a quick overview of dimensions and variables, the command-line utility ncdump can
be used. This program is shortly described first. More sophisticated tools exist, e.g. for
cutting out subsets of data and for producing averages or time series. One of these tools
is the collection of cdo utilities.
7.1.1. ncdump
Ncdump comes with the NetCDF library as provided by Unidata and generates a text
representation of a NetCDF file on standard output. The text representation is in a form
called CDL (network Common Data form Language). Ncdump may be used as a simple
browser for NetCDF data files, to display the dimension names and sizes, variable names,
types and shapes, attribute names and values, and optionally, the data values themselves
for all or selected variables in ASCII format. For example, to investigate the structure of
a NetCDF file, use
ncdump -c data-file.nc
Dimension names and sizes, variable names, dependencies and values of dimensions will
be displayed. To get only header information (same as -c option but without the values
of dimensions) use
ncdump -h data-file.nc
To display the values of a specified variable which is contained in the NetCDF file, type
ncdump -v variable-name data-file.nc
to produce an annotated CDL version of the structure and the data in the NetCDF file
data-file.nc. You can also save the data of specified variables, for example in *.txt files,
simply by redirecting the output:
ncdump -v variable-name data-file.nc > data-file.txt
Further details on ncdump can be found at
http://www.unidata.ucar.edu/software/netcdf/docs/netcdf utilities guide.html#ncdump guide
7.1.2. cdo
The CDO (Climate Data Operators) are a collection of command-line operators to manip-
ulate and analyse NetCDF and GRIB data. The CDO package is developed and maintained
at the MPI for Meteorology in Hamburg. Source code and documentation are available from
https://code.zmaw.de/projects/cdo
The tool includes more than 400 operators to print information about data sets, to copy,
split and merge data sets, to select parts of a data set, to compare and modify data sets,
to process data sets arithmetically, to produce different kinds of statistics, to detrend time
series, and to perform interpolations and spectral transformations. The CDOs can also be
used to convert from GRIB to NetCDF or vice versa, although some care has to be taken
there.
In particular, the "operator" cdo infov writes information about the structure and con-
tents of all input files to standard output. By typing
cdo infov data-file.nc
in the command line, the following elements are printed for each field: date and time,
parameter identifier and level, size of the grid and number of missing values, minimum,
mean and maximum. A variant of this CDO operator is cdo sinfov, which prints the
structure of the data set without computing the minimum, mean and maximum values.
Ncview is a visual browser for NetCDF format files developed by David W. Pierce. Using
ncview you can get a quick and easy look at regular grid data in your NetCDF files. It is
possible to view simple movies of data, view along different dimensions, to have a look at
actual data values at specific coordinates, change colormaps, invert data, etc.
ncview data-file.nc
If data-file.nc contains wildcards such as '*' or '?', then all files matching the pattern are
scanned, provided that all of the files contain the same variables on the same grid. Choose the
variable you want to view. Variables which are functions of longitude and latitude will be
displayed in two-dimensional images. If there is more than one time step available you can
easily view a simple movie by just pushing the forward button. The appearance of the
image can be changed by varying the colors of the displayed range of the data set values
or by adding/removing coastlines. Each one- or two-dimensional subset of the data can
be selected for visualization. Ncview allows the selection of the dimensions of the fields
available, e.g. longitude and height instead of longitude and latitude of 3D fields.
The pictures can be sent to PostScript (*.ps) output by using the function print. Be
careful to use the close button whenever you want to close only a single plot window,
because clicking on the ✕ icon on the top right of the window will close all ncview
windows and terminate the entire program!
Besides an interactive mode, NCL allows for script processing (recommended). NCL scripts
are processed on the command-line by typing
ncl filename.ncl
For visualizing ICON data on the native triangular grid, we recommend using NCL 6.2.0
or higher.
The following example script creates a temperature contour plot with NCL (see Figure 7.1).
The file names grid-file.nc and data-file.nc are placeholders for an ICON grid file and the
corresponding ICON output file:
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"

begin
  rad2deg  = 45./atan(1.)
  GridFile = addfile("grid-file.nc","r")       ; ICON grid file (placeholder name)
  File     = addfile("data-file.nc","r")       ; ICON model output (placeholder name)

  ; read data
  ;
  clon = GridFile->clon * rad2deg              ; cell centers (lon)
  clat = GridFile->clat * rad2deg              ; cell centers (lat)
  vlon = GridFile->clon_vertices * rad2deg     ; vertex coordinates of each triangle (lon)
  vlat = GridFile->clat_vertices * rad2deg     ; vertex coordinates of each triangle (lat)

  temp_ml = File->temp(:,:,:)                  ; dims: (time,lev,cell)
  print("max T " + max(temp_ml) )
  print("min T " + min(temp_ml) )

  ; create plot
  ;
  wks = gsn_open_wks("ps","outfile")
  gsn_define_colormap(wks,"testcmap")          ; choose colormap

  ResC                 = True
  ResC@sfXArray        = clon                  ; cell center (lon)
  ResC@sfYArray        = clat                  ; cell center (lat)
  ResC@sfXCellBounds   = vlon                  ; define triangulation
  ResC@sfYCellBounds   = vlat                  ; define triangulation
  ResC@cnFillOn        = True                  ; do color fill
  ResC@cnFillMode      = "cellfill"
  ResC@cnLinesOn       = False                 ; no contour lines

  plot = gsn_csm_contour_map(wks, temp_ml(0,0,:), ResC)   ; draw one level of the first time step
end
To open a data file for reading, the function addfile returns a file variable reference
to the specified file. Second, for drawing graphics, the function gsn open wks creates an
output workstation, where the "ps", "pdf" or "png" formats are available. Third, the command
gsn csm contour map creates and draws a contour plot over a map.
Loading the coordinates of the triangle cell centers into NCL (resources sfXArray and
sfYArray) is essential for visualizing ICON data on the native grid. Loading the vertex
coordinates of each triangle (resources sfXCellBounds and sfYCellBounds), however, is
optional. If not given, a Delaunay triangulation will be performed by NCL, based on the
cell center information. If given, the triangles defining the mesh will be deduced by sorting
and matching vertices from adjacent cell boundaries. If you are interested in the correct
representation of individual cells, the resource sf[X/Y]CellBounds should be set.
Creating a plot can get very complex depending on how you want to look at your data.
Therefore we refer to the NCL documentation that is available online under
http://www.ncl.ucar.edu
Section 7.3.2 contains a step-by-step tutorial for another NCL example. For the exercises in
this tutorial we refer to the prepared NCL scripts. These files are stored in the subdirectory
test cases/casexx together with the model run scripts.
Figure 7.1.: ICON temperature field on a specific model level produced with the above
NCL script.
In the following we provide a detailed step-by-step tutorial for producing graphics from
an ICON data set. We will use NCL’s batch mode, i.e. instead of typing each command
in interactive mode, we will create a file visualization tutorial.ncl where a sequence
of commands can be stored and executed with
ncl visualization_tutorial.ncl
Please note that this tutorial script requires NCL version 6.2.0 or higher.
We begin by loading some NCL ”libraries” which provide high-level plotting functions
; load libraries
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_code.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/gsn_csm.ncl"
load "$NCARG_ROOT/lib/ncarg/nclscripts/csm/contributed.ncl"
These lines also contain a comment. Comment lines in NCL are preceded by the semicolon
character ’;’.
Then, as the ICON model uses an unstructured grid topology, we open and read such a
topology file, stored in NetCDF format, by the following commands:
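A minimal sketch of these commands (the grid file name is only a placeholder):
gridfile = addfile("icon_grid.nc","r")   ; open the ICON grid (topology) file
print(gridfile)                          ; list all variables contained in the file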
The print command lists all variables that have been found in the NetCDF file as textual
output. For the ICON grid, the vertex positions of the grid triangles are of special interest.
They are stored as longitude/latitude positions in the vlon, vlat (this is explained in more
detail in Section 2.1.1, page 13). For NCL we convert from steradians to degrees:
rad2deg = 45./atan(1.)
vlon = gridfile->vlon * rad2deg
vlat = gridfile->vlat * rad2deg
Additionally, we load the vertex indices for each triangle edge of the icosahedral mesh.
edge_vertices = gridfile->edge_vertices
The indices are stored in the grid file data set edge vertices and reference the corre-
sponding vertices from vlon, vlat. Note that these indices are 1-based, so we will subtract 1
when using them below in order to account for the 0-based array indexing of NCL.
Furthermore, it is convenient to store the size of the edges array, i.e. the number of grid
edges, in a local variable nedges.
size_edge_vertices = dimsizes(edge_vertices)
nedges = size_edge_vertices(1)
Producing graphics with NCL requires the creation of a so-called workstation, i.e. a de-
scription of the output device. In this example, this “device” will be a PostScript file
plot.ps, but we could also define a different output format, e.g. "png" instead.
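A sketch of the corresponding command, using the output format and file name described above:
wks = gsn_open_wks("ps","plot")          ; creates plot.ps; use "png" instead for PNG output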
Then the map settings have to be defined and we collect these specifications in a data
structure named config1. First of all, we disable the immediate drawing of the map
image, since the ICON icosahedral grid plot will consist of two parts: the underlying map
and the grid lines. We do so by setting gsnFrame and gsnDraw to False.
We then define an orthographic projection centered over Europe. It is important that grid
lines are drawn as true geodesic lines, since otherwise the illustration of the ICON grid would
contain graphical artifacts; therefore we set the parameter mpGreatCircleLinesOn.
config1 = True
config1@gsnMaximize = True
config1@gsnFrame = False
config1@gsnDraw = False
config1@mpProjection = "Orthographic"
config1@mpGreatCircleLinesOn = True
config1@mpCenterLonF = 10
config1@mpCenterLatF = 50
config1@pmTickMarkDisplayMode = "Always"
Having completed the setup of the config1 data structure, we can create an empty map
by the following command:
map = gsn_csm_map(wks,config1)
Now, the edges of the ICON grid must be added to the plot. As described before, we
convert the indirectly addressed edge vertices into an explicit list of geometric segments
with dimensions [nedges × 2]:
Figure 7.2.: The two plots generated by the NCL example script in Section 7.3.2.
ecx = new((/nedges,2/),double)
ecy = new((/nedges,2/),double)
ecx(:,0) = vlon(edge_vertices(0,:)-1)
ecx(:,1) = vlon(edge_vertices(1,:)-1)
ecy(:,0) = vlat(edge_vertices(0,:)-1)
ecy(:,1) = vlat(edge_vertices(1,:)-1)
There exists an NCL high-level command for plotting lines, gsn add polyline. Since
this function expects one-dimensional lists for its interface, we use the auxiliary function
ndtooned for reshaping the array of lines,
lines_cfg = True
lines_cfg@gsSegments = ispan(0,nedges * 2,2)
poly = gsn_add_polyline(wks,map,ndtooned(ecx),ndtooned(ecy),lines_cfg)
draw(map)
frame(wks)
The first page of the resulting PostScript file plot.ps will contain an illustration similar
to Fig. 7.2 (left part).
In order to visualize unstructured data sets that have been produced by the ICON model
they have to be stored in NetCDF format. As a second file we open such a NetCDF data
set datafile.nc in read-only mode and investigate its data set topography c:
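A sketch of these commands; the file name datafile.nc is taken from the text, and it is assumed
here that the cell-center coordinates clon and clat are stored in the same file:
datafile = addfile("datafile.nc","r")    ; ICON model output, opened read-only
topo     = datafile->topography_c        ; the data set we want to visualize
printVarSummary(topo)                    ; print its dimensions and attributes

clon = datafile->clon * rad2deg          ; triangle circumcenters (lon)
clat = datafile->clat * rad2deg          ; triangle circumcenters (lat)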
The final step of this exercise is the creation of a contour plot from the data contained in
datafile. As it has been stated by the previous call to printVarSummary, the data sites
for the field topography c are the triangle circumcenters, located at clon, clat.
For a basic contour plot, a cylindrical equidistant projection with automatic adjustment
of contour levels will do. It is important to specify the two additional arguments sfXArray
and sfYArray.
config2 = True
config2@mpProjection = "CylindricalEquidistant"
config2@cnFillOn = True
config2@cnLinesOn = False
config2@sfXArray = clon
config2@sfYArray = clat
Afterwards, we generate the plot (page 2 in our PostScript file) with a call to
gsn csm contour map.
map = gsn_csm_contour_map(wks,topo,config2)
Note that this time it is not necessary to launch additional calls to draw and frame, since
the default options in config2 are set to immediate drawing mode.
You may wonder why the plot has a rather smooth appearance without any indication of
the icosahedral triangular mesh. What happened is that NCL generated its own Delaunay
triangulation building upon the cell center coordinates provided via clon, clat. Thus,
we are unable to locate and investigate individual ICON grid cells. In order to visual-
ize individual cells, we need to additionally load the vertex coordinates of each triangle
into NCL. This information is also available from the grid file and is stored in the fields
clon vertices, clat vertices.
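A sketch of the additional settings; it is assumed here that the vertex fields are read from the
grid file that was opened at the beginning of the script:
clon_v = gridfile->clon_vertices * rad2deg   ; vertex coordinates of each triangle (lon)
clat_v = gridfile->clat_vertices * rad2deg   ; vertex coordinates of each triangle (lat)

config2@sfXCellBounds = clon_v               ; use the true cell geometry ...
config2@sfYCellBounds = clat_v               ; ... instead of a Delaunay triangulation
config2@cnFillMode    = "CellFill"           ; fill every grid cell with a single color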
By choosing the CellFill mode, it is ensured that every grid cell is filled with a single
color.
Afterwards we generate the plot once more with a call to gsn csm contour map.
map = gsn_csm_contour_map(wks,topo,config2)
GMT is an open source collection of command-line tools for manipulating geographic and
Cartesian data sets and producing PostScript illustrations ranging from simple x-y plots
via contour maps to 3D perspective views. GMT supports various map projections and
transformations and facilitates the inclusion of coastlines, rivers, and political boundaries.
GMT is developed and maintained at the University of Hawaii, and it is supported by the
National Science Foundation.
http://gmt.soest.hawaii.edu
Since GMT is comparatively fast, it is especially suited for visualizing high resolution
ICON data on the native (triangular) grid. It is capable of visualizing individual grid cells
and may thus serve as a helpful debugging tool. So far, GMT is not capable of reading
ICON NetCDF or GRIB2-output offhand. However, CDO can be used to convert your
data to a format readable by GMT.
From your NetCDF output, you should first select your field of interest and pick a single
level at a particular point in time:
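A hedged example using CDO selection operators; the variable name temp, the time step, the
level index and the file names are placeholders:
cdo -sellevidx,80 -seltimestep,1 -selname,temp icon_output.nc temp_lev80.nc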
Now this file must be processed further using the outputbounds command from CDO,
which finally leads to an ASCII file readable by GMT.
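A sketch of this step (the input file name is a placeholder; test.gmt is the file name used by
the GMT script below):
cdo outputbounds temp_lev80.nc > test.gmt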
# Date = 2012-01-04
# Time = 00:00:00
# Name = temp
# Code = 0
# Level = 80
#
> -Z254.71
-155.179 90
36 89.5695
108 89.5695
-155.179 90
> -Z255.276
36 89.5695
36 89.1351
72 89.2658
36 89.5695
...
For each triangle, it contains the corresponding data value (indicated by -Z) and vertex
coordinates.
As a starting point, a very basic GMT script is added below. It visualizes the content of
test.gmt on a cylindrical equidistant projection including coastlines and a colorbar. An
example plot based on this script is given in Figure 7.3.
#!/bin/bash
# Input filename
INAME="test.gmt"
# Output filename
ONAME="test.ps"
# plot the triangles, colored according to their -Z values
# (colors.cpt is a GMT color palette table, created beforehand e.g. with makecpt)
psxy $INAME -Rd -Jq0/1:190000000 -Ccolors.cpt -K > $ONAME
# visualize coastlines
pscoast -Rd -Jq0/1:190000000 -Dc -W0.25p,black -K -O >> $ONAME
# plot colorbar
psscale -D11c/14c/18c/1.0ch -Ccolors.cpt -E -B:"T":/:K: -U -O >> $ONAME
Note: In order to get filled polygons, the -L option must be added to psxy. The purpose
of -L is to force closed polygons, which is a prerequisite for polygon filling. However, in the
latest release of GMT (5.1.2) adding this option results in very large output files whose
rendering is extremely slow. Thus, the -L option was omitted here so that only triangle
edges are drawn and colored.
Figure 7.3.: ICON temperature field on a specific model level produced with the above
GMT script.
8. Running ICON-ART
In this lesson you will learn how to run the package for Aerosols and Reactive Trace
Gases, ICON-ART (Rieger et al., 2015).
ICON-ART is an extension of the ICON model that was developed at the Institute of
Meteorology and Climate Research (IMK) at the Karlsruhe Institute of Technology (KIT).
It allows the online calculation of reactive trace substances and their interaction with the
atmosphere. The interfaces to the ART code are part of the official ICON code.
In order to obtain the ART code, the institution that wants to use ICON-ART has to sign
an additional license agreement with Karlsruhe Institute of Technology (KIT). Further
information can be found on the following website:
http://icon-art.imk-tro.kit.edu
After you have signed the license agreement, you will be provided with a compressed file
with the recent source code of ART which is called ART-v<X>.<YY>.tar.gz. <X> and <YY>
indicate the version number.
The ART directory contains several subdirectories. The purposes of those subdirectories
are explained in the following.
ICON-ART solves the diffusion equation of aerosol. For this purpose, the following pro-
cesses have to be considered: advection (1), turbulent diffusion (1), changes due to subgrid-
scale convective transport (1), sedimentation, washout, coagulation, condensation from the
gas phase, radioactive decay, and emissions (2). With a few exceptions (marked by (1) and
(2)), the modules calculating the tendencies due to these processes are stored within the
directory containing the aerosol routines.
• (1): The tendencies due to these processes are calculated within the ICON code. This
is part of the tracer framework of ICON.
• (2): The emission routines are an important source for atmospheric aerosol. ART offers
the option to easily plug in new emission schemes. In order to keep clarity within
the folders, emission routines get their own folder emissions (see below).
ICON-ART solves the diffusion equation of gaseous tracers. Besides advection, turbulent
diffusion and subgrid-scale convective transport which are treated by the ICON tracer-
framework, this includes also chemical reactions. The chemistry directory contains the
routines to calculate chemical reaction rates of gaseous species.
Within the emissions directory, emission routines for aerosol and gaseous species are
stored.
Within the externals directory, code from external libraries is stored (i.e. cloudj, mecicon,
tixi).
The Directory io
The mozart init directory contains routines needed for an initialization of ICON-ART
tracers with Mozart results.
Modules within the phy interact directory treat the direct interaction of aerosol particles
and trace gases with physical parameterizations of ICON. Examples are the interaction of
aerosols with clouds (i.e. the two-moment microphysics) and radiation.
The runctrl examples directory contains example run control files, organized in the fol-
lowing subfolders:
emiss ctrl storage of example emission files for volcanic emissions, radioactive release
init ctrl location of coordinate file for MOZART initialisation and initialisation table for
LINOZ (linearized ozone) algorithm
photo ctrl location of CloudJ input files (cross sections, Q-yields)
run scripts run scripts for the testsuite and the training course test cases
xml ctrl storage of basic .xml files for tracer registration and system files (.dtd)
The shared directory contains a collection of routines that do not fit into other categories.
This applies mostly to initialization and infrastructure routines.
The tools directory contains helpful tools for ART developers (e.g. a generalized clipping
routine).
8.3. Installation
In this section, a brief description of how to compile ICON-ART is given. The user has
to follow the same steps as for compiling ICON, with a few additions. The reader is referred
to Section 1.2.2 or Zängl et al. (2014) in order to compile ICON successfully. First, the
ART-v<X>.<YY>.tar.gz file has to be uncompressed. You will obtain a directory, which
should be copied inside the ICON source directory $ICON-DIR/src/. In the following, we
refer to this directory $ICON-DIR/src/ART-v<X>.<YY> as $ARTDIR .
If you have compiled ICON without ART before, you have to clean up first:
make distclean
./build_command
It is necessary for the user to choose the ICON settings carefully to obtain a stable ICON-
ART simulation with scientifically reasonable results. Hence, the user should pay special
attention to the namelist parameters listed in table 8.1.
ICON-ART has its own namelist to modify the setup of ART simulations at runtime. The
main switch for ART, lart, is located inside run nml. The namelist for the other ART
switches is called art nml.
A naming convention is used in order to represent the type of data. An INTEGER namelist
parameter starts with iart , a REAL namelist parameter starts with rart , a LOGICAL
namelist parameter starts with lart , and a CHARACTER namelist parameter starts with
cart .
The ICON-ART namelist is located in the module src/namelists/mo art nml.f90. Gen-
eral namelist parameters are listed and explained within Table 8.2. Namelist parameters
for ART input are listed within Table 8.3. Namelist parameters related to atmospheric
chemistry are listed within Table 8.4. Namelist parameters related to aerosol physics are
listed within Table 8.5.
Table 8.2.: General namelist parameters to control the ART routines. These switches are
located inside art nml. The only exception is the lart switch which is located
in the run nml.
Namelist Parameter Default Description
lart .FALSE. Main switch which enables the ART
modules. Located in the namelist
run nml.
iart ntracer 0 Number of transported ART trac-
ers. This number is automatically
added to the ICON variable ntracer.
It has to be equal to the number
of tracers listed in your XML files
specified by cart chemistry xml,
cart aerosol xml and
cart passive xml.
lart chem .FALSE. Enables chemistry. The chemical mech-
anism and the according species are set
via iart chem mechanism.
lart pntSrc .FALSE. Enables point sources for passive
tracers. The sources are controlled via
cart pntSrc xml. See also Section 8.4.5.
lart aerosol .FALSE. Main switch for the treatment of atmo-
spheric aerosol.
lart passive .FALSE. Main switch for the treatment of passive
tracers.
lart diag out .FALSE. If this switch is set to .TRUE., diagnos-
tic output fields are available. Set it to
.FALSE. when facing memory problems.
Table 8.3.: Namelist parameters to control ART input. These switches are located inside
art nml. For details regarding the tracer and modes initialization with XML
files, see Section 8.4.3.
cart chemistry xml ’’ Path and file name to the XML file for
specifying chemical tracer. See also Sec-
tion 8.4.3.
cart aerosol xml ’’ Path and file name to the XML file for
specifying aerosol tracer. See also Sec-
tion 8.4.3.
cart passive xml ’’ Path and file name to the XML file for
specifying passive tracer. See also Sec-
tion 8.4.3.
cart modes xml ’’ Path and file name to the XML file for
specifying aerosol modes. See also Sec-
tion 8.4.4.
cart pntSrc xml ’’ Path and file name to the XML file for
specifying point source emissions. See
also Section 8.4.5.
Table 8.4.: Namelist parameters related to atmospheric chemistry. These switches are lo-
cated inside art nml.
Namelist Parameter Default Description
iart chem mechanism 0 Sets the chemical mechanism and takes
care of the allocation of the according
species. Possible values:
0: Stratosph. short-lived Bromocarbons.
1: Simplified OH chemistry, takes pho-
tolysis rates into account
2: Full gas phase chemistry
Table 8.5.: Namelist parameters related to aerosol physics. These switches are located in
art nml .
Namelist Parameter Default Description
iart seasalt 0 Treatment of sea salt aerosol. Possible
values:
0: No treatment.
1: As specified in Lundgren et al. (2013).
iart volcano 0 Treatment of volcanic ash particles. Pos-
sible values:
0: No treatment.
1: 1-moment treatment. As described in
Rieger et al. (2015).
2: 2-moment treatment.
cart volcano file ’’ Path and filename of the input file for
the geographical positions and the types
of volcanoes.
The definition of tracers in ICON-ART is done with the use of three XML files. A dis-
tinction between aerosol, chemical and passive tracers is made. Aerosol tracers are defined
by an XML file specified with the namelist switch cart aerosol xml, containing all liquid
and solid particles participating in aerosol dynamics. Chemical tracers are defined by an
XML file via cart chemistry xml and cover gaseous substances participating in chemical
reactions. The passive category comprises all gaseous, liquid and solid tracers that only
participate in transport processes; these are specified with the XML file given by
cart passive xml. With the following example XML file, two passive tracers called
trPASS1 and trPASS2 are defined:
<tracers>
<passive id="trPASS1">
<transport type="char">stdaero</transport>
<unit type="char">kg kg-1</unit>
</passive>
<passive id="trPASS2">
<transport type="char">stdaero</transport>
<unit type="char">kg kg-1</unit>
</passive>
</tracers>
Additionally, each tracer gets two different pieces of meta data. Firstly, transport defines a
template of horizontal and vertical advection schemes and flux limiters; in this example the
template stdaero is used. A description of available transport templates is given below.
Secondly, a unit is specified, which will also be added as meta data to any output of the
tracer. Note that the type of each meta data item has to be specified. In this example, both
meta data are of type character ("char"). For other meta data you could also choose integer
("int") or real ("real").
Passive Tracers
For passive tracers, the only required meta data is unit (char). You can find an exam-
ple for the passive tracer XML file at the runctrl examples/xml ctrl folder named
tracers passive.xml.
Chemical Tracers
For chemical tracers, the only required meta data is unit (char). Usually, mol weight
(molecular weight, real) is also needed, although it is technically not required. You can
find an example for the chemical tracer XML file at the runctrl examples/xml ctrl
folder named tracers chemtracer.xml.
Aerosol Tracers
For aerosol tracers, there is a list of necessary meta data specifications: the meta data unit
(char), moment (int), mode (char), sol (solubility, real), rho (density, real) and mol weight
(molecular weight, real) are required. You can find an example for the aerosol tracer XML
file at the runctrl examples/xml ctrl folder named tracers aerosol.xml.
Currently, there are three different transport templates available: off, stdaero and
stdchem. These templates avoid the necessity to add a tracer advection scheme and
flux limiter for each single tracer in the namelist. Hence, the values of the namelist para-
meters ihadv tracer, ivadv tracer, itype hlimit and itype vlimit are overwritten
by the template. Specific information concerning the advection schemes mentioned in the
following can be found in Zängl et al. (2014).
Specifying off, all advective transport for this tracer is turned off (i.e. ihadv tracer and
ivadv tracer are set to 0).
The transport template stdaero uses a combination of Miura and Miura with subcy-
cling for the horizontal advection (i.e. ihadv tracer = 22) and the 3rd order piecewise
parabolic method handling CFL > 1 for vertical advection (i.e. ivadv tracer = 3), in com-
bination with monotone flux limiters (i.e. itype hlimit = 4 and itype vlimit = 4). This
means that the conservation of linear correlations is guaranteed, which is important for
modal aerosol with prognostic mass and number and diagnostic diameter. By this, the
diameter of the aerosol does not change due to transport.
The transport template stdchem uses the same advection schemes as stdaero (i.e.
ihadv tracer = 22 and ivadv tracer = 3). However, the considerably faster positive
definite flux limiters are used (i.e. itype hlimit = 3 and itype vlimit = 2). By this,
the mass is still conserved. However, the conservation of linear correlations is traded for a
faster computation of the advection.
As you might have noticed in the previous section, the choice of a transport template is
not required at the tracer definition. If no transport template is chosen, stdaero is used
as default template.
Similar to the previously described tracer definition with XML files, aerosol modes
are also defined with XML files. The according namelist parameter specifying the
XML file is called cart modes xml. An example file modes.xml is provided in the
runctrl examples/xml ctrl folder. The definition of a mode is done as shown in the
following example:
<modes>
<aerosol id="seasa">
<kind type="char">2mom</kind>
<d_gn type="real">0.200E-6</d_gn>
<d_gm type="real">0.690E-6</d_gm>
<sigma_g type="real">1.900E+0</sigma_g>
<rho type="real">2.200E+3</rho>
</aerosol>
</modes>
In this example, a mode called seasa is defined with 2 prognostic moments (2mom). The
initial number and mass diameters (d gn and d gm) as well as the geometric standard
deviation (sigma g) and density (rho) are specified. For an aerosol tracer that shall be
associated with this mode, the meta data item mode has to be set to seasa in the tracer
definition (see previous section). In general, all available modes are listed in the example file
modes.xml. Hence, it is highly recommended to adapt this file according to your simulation
setup.
ICON-ART provides a module which adds emissions from point sources to existing tracers.
The namelist switches associated to this module are lart passive, lart pntSrc and
cart pntSrc xml (see Section 8.4.2).
The prerequisite is that you have added a passive tracer via an XML file using the namelist
parameter cart passive xml. Starting from this point, point sources can be added using
an XML file specified via cart pntSrc xml. Additionally, you have to set lart passive
and lart pntSrc to .TRUE.. Inside the XML file specified via cart pntSrc xml, you can
add point sources following the subsequent example:
<sources>
<pntSrc id="RNDFACTORY">
<substance type="char">testtr</substance>
<lon type="real">8.00</lon>
<lat type="real">48.00</lat>
<source_strength type="real">1.0</source_strength>
<height type="real">10.</height>
<unit type="char">kg s-1</unit>
</pntSrc>
</sources>
The options you can specify here have the following meaning:
• pntSrc id: The name of the point source. This information is actually not used in
the ICON-ART code and serves only for a better readability of the XML file. Hence,
also multiple point sources with the same id are technically allowed.
• substance: This is the name of the substance, the point source emission is added
to. Here you have to specify the very same name of the tracer that you have specified
in the cart passive xml file.
• source strength: The source strength of the point source in the unit specified
below.
• unit: The unit of the source strength. Note that currently every unit other than
kg s-1 will lead to a model abort, as no unit conversion has been implemented so far.
Via the XML file you can also specify multiple point sources. By this, you can either add
point sources to different tracers or specify multiple sources for a single tracer, for example
with differing source strengths.
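Putting these switches together, a minimal hedged sketch of the corresponding namelist
settings could look as follows; the file names are placeholders and iart ntracer has to match
the number of tracers defined in your XML files (cf. Table 8.2):
&run_nml
 lart = .TRUE.                              ! main ART switch
/
&art_nml
 iart_ntracer     = 1                       ! one passive tracer defined in the XML file
 lart_passive     = .TRUE.
 lart_pntSrc      = .TRUE.
 cart_passive_xml = './tracers_passive.xml' ! placeholder path
 cart_pntSrc_xml  = './pntSrc.xml'          ! placeholder path
/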
If volcanic eruptions should be considered the switch iart volcano has to be set. For
the 1-moment description of volcanic ash, i.e. six monodisperse size bins for the number
concentrations, the integer value is 1. For the 2-moment description where 3 lognormal
modes are used, the switch has to be set to 2.
Further input is necessary to define the appropriate volcano(s): the path and file name of
the volcano list have to be specified via cart volcano file (see Table 8.5).
With this setup ICON-ART performs a simulation with the standard parameters for the
respective volcano type.
8.5. Output
In principle, output of ICON-ART variables works the same way as for ICON variables.
The following five quantities of the output have to be specified:
In general, the output of all (prognostic) tracers defined in the different XML
files (passive, chemistry, aerosol) is possible. Additionally, several diagnostic output
variables have been added in ICON-ART. These are listed in table 8.6.
There is an option to obtain all variables belonging to a certain group without having
to specify all of them; all output variables that are associated with such a group will be
written. Available output groups are ART AERO VOLC, ART AERO RADIO, ART AERO DUST,
ART AERO SEAS, ART CHEMTRACER and ART PASSIVE. As the names indicate, these
groups contain variables associated with volcanic ash aerosol, radioactive particles,
mineral dust aerosol, sea salt aerosol, chemical tracers and passive tracers, respectively.
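Assuming the usual group syntax of the ICON output namelist (a group is requested with the
prefix group:), such a request might look as follows:
&output_nml
 ...
 ml_varlist = 'group:ART_AERO_VOLC'   ! write all variables belonging to this output group
/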
Table 8.6.: Selected list of available diagnostic output fields for aerosol.
Variable(s)                          Associated namelist switches             Description                                Group
seasa diam, seasb diam,              iart seasalt = 1,                        Median diameter of sea salt                ART AERO SEAS
seasc diam                           lart diag out = .true.                   modes A, B and C, respectively
asha diam, ashb diam,                iart volcano = 2,                        Median diameter of volcanic ash            ART AERO VOLC
ashc diam                            lart diag out = .true.                   modes A, B and C, respectively
tau volc 340nm, tau volc 380nm,      iart volcano = 2,                        Volcanic ash optical depth at the          ART AERO VOLC
tau volc 440nm, tau volc 500nm,      lart diag out = .true.                   wavelength indicated by the name
tau volc 550nm, tau volc 675nm,
tau volc 870nm, tau volc 1020nm,
tau volc 1064nm
ash total mc                         iart volcano = 2,                        Total concentration of volcanic            ART AERO VOLC
                                     lart diag out = .true.                   ash in the column
8.6. Exercises
In order to start with the exercises, you have to copy and unpack the ART code inside
your ICON source directory as described in section 8.3.
You will find the ART code at:
/e/uwork/trng024/packages/ART-v2.1.00.tar.gz.
Folders with the input data for all ART exercises can be found at:
/e/uwork/trng024/packages/ART-INPUT.
After you have copied the source code, you have to install ICON-ART. For this purpose,
proceed as described in Section 8.3. After a successful compilation of ICON-ART, you can
start with the experiments that were prepared for this purpose:
EX 8.1
In this exercise, you will learn to add your own tracers with emissions from point
sources. This exercise makes use of the same setup for ICON-LAM as you have
already used in Ex. 5.1 of Chapter 5. You are free to choose the tracer(s) to
transport and the location of the point source(s).
• You will find a copy of the runscript used in the previous ICON-LAM test case
inside $ARTDIR/runctrl examples/run scripts called
exp.art.trng17.case1.pntSrc.
• Edit the run script according to the namelist parameters you find in
Section 8.4. You will find a ? at all places where you have to edit something.
• Create the XML files that are needed to define tracers (see section 8.4.3) and
point sources (see section 8.4.5).
EX 8.2
Biogenically emitted Very Short-Lived Substances (VSLS) have a short chemical lifetime
in the atmosphere compared to tropospheric transport timescales. As the ocean is
the main source of the most prominent VSLS, bromoform (CHBr3) and
dibromomethane (CH2Br2), this leads to large concentration gradients in the
troposphere. The tropospheric depletion of CHBr3 is mainly due to photolysis,
whereas for CH2Br2 the loss is dominated by oxidation by the hydroxyl radical
(OH); both loss pathways contribute to the atmospheric inorganic bromine (Bry) budget.
In this exercise the fast upward transport of both VSLS from the lower boundary
into the upper troposphere / lower stratosphere (UTLS) due to the super-typhoon
Haiyan will be simulated.
• Inside the ART-INPUT folder you will find a folder called CASE2-VSLS
containing all input data required for this test case.
• Inside $ARTDIR/runctrl examples/run scripts you will find the run script
for this test case called exp.art.trng17.case2.vsls. Edit the run script
according to the namelist parameters you find in Section 8.4. You will find a ?
at all places where you have to edit something.
• Create the XML file that is needed to define tracers (see section 8.4.3). For
this configuration, you will have to add tracers for CHBr3 and CH2Br2.
EX 8.3
Volcano Eruption
• Inside the ART-INPUT folder you will find a folder called CASE3-VOLC
containing all input data required for this test case.
• Inside $ARTDIR/runctrl examples/run scripts you will find the run script
for this test case called exp.art.trng17.case3.volc. Edit the run script
according to the namelist parameters you find in Section 8.4. You will find a ?
at all places where you have to edit something. For cart volcano file, you
can use the file ART-INPUT/CASE3-VOLC/volcano list Eyjafjoell.txt.
• Create the XML file that is needed to define tracers (see section 8.4.3). You
also have to create an XML file specifying the aerosol modes in the simulation
(see section 8.4.4). For the 2-moment description of volcanic ash, you will need
the modes asha, ashb and ashc and the according tracers.
• You can use the namelist switch lart diag out to obtain diagnostical
properties like aerosol optical depth and median diameters.
EX 8.4
Sea salt aerosol is one of the main contributors to natural atmospheric aerosol.
With its high hygroscopicity it is a very efficient cloud condensation nucleus (CCN).
Within ICON-ART, sea salt aerosol is described as a log-normally distributed
aerosol in three modes with prognostic mass mixing ratios and prognostic number
mixing ratios. This simulation also includes a nested domain covering Europe and
North Africa with a higher spatial resolution. In order to perform the simulation,
you have to do the following steps:
• Inside the ART-INPUT folder you will find a folder called CASE4-SEAS
containing all input data required for this test case.
• Inside $ARTDIR/runctrl examples/run scripts you will find the run script
for this test case called exp.art.trng17.case4.seas. Edit the run script
according to the namelist parameters you find in section 8.4. You will find a ?
at all places where you have to edit something.
• Create the XML file that is needed to define tracers (see section 8.4.3). You
also have to create an XML file specifying the aerosol modes in the simulation
(see section 8.4.4). For the 2-moment description of sea salt, you will need the
modes seasa, seasb and seasc and the according tracers.
Numerical weather prediction (NWP) is an initial value problem. The ability to make a
skillful forecast heavily depends on an accurate estimate of the present atmospheric state,
known as the analysis. In general, an analysis is generated by combining, in an optimal way,
all available observations with a short-term forecast of a general circulation model (e.g.
ICON).
Stated in a more abstract way, the basic idea of data assimilation is to fit model states x to
observations y. Usually, we do not observe model quantities directly or not at the model
grid points. Here, we work with observation operators H which take a model state and
calculate a simulated observation y = H(x). In terms of software, these model opera-
tors can be seen as particular modules, which operate on the ICON model states. Their
output is usually written into so-called feedback files, which contain both the real obser-
vation ymeas with all its meta data (descriptions, positioning, further information) as well
as the simulated observation y = H(x).
However, data assimilation cannot be treated at one point in time only. The information
passed on from the past is a crucial ingredient for any data assimilation scheme. Thus,
cycling is an important part of data assimilation. It means that we
1. Carry out the core data assimilation component to calculate the so-called analysis
$x^{(a)}$, i.e. a state which best fits the previous information and the observations y,
2. Propagate the analysis $x_k^{(a)}$ to the next analysis time $t_{k+1}$. Here, it is called first
guess or background $x_{k+1}^{(b)}$,
3. Carry out the next analysis by running the core data assimilation component, generating
$x_{k+1}^{(a)}$, then cycling the steps (a schematic sketch of this loop is given below).
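The following Python sketch shows the structure of such a cycling loop. It assumes that the analysis step, the forecast model and the observation retrieval are available as black-box callables; the function names, the arguments and the 3 h window are illustrative placeholders, not part of the operational system.

from datetime import timedelta

def run_assimilation_cycle(x_first_guess, t_start, n_cycles,
                           compute_analysis, forecast_model, get_observations,
                           window=timedelta(hours=3)):
    """Schematic analysis/forecast cycling: analysis -> short forecast ->
    new first guess -> next analysis (illustrative only)."""
    x_b, t = x_first_guess, t_start
    for _ in range(n_cycles):
        y = get_observations(t, window)            # observations valid in this window
        x_a = compute_analysis(x_b, y)             # analysis step (e.g. 3D-Var or LETKF)
        x_b = forecast_model(x_a, t, t + window)   # propagate analysis to next analysis time
        t = t + window
    return x_b, t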
See Figure 9.1 for a schematic of the basic assimilation process.
Figure 9.1.: Basic ICON cycling environment using 3DVar. Observations are merged with
a background field taken from a 3 h forecast (first guess) of the ICON model.
Courtesy of R. Potthast, DWD.
The analysis is obtained by minimizing a cost function of the form (9.1), which weights the
deviation from the background against the deviation from the observations. Here, B is the
covariance matrix of the background error distribution, which makes sure that information
available at some place is properly distributed into its neighborhood, and R is the error
covariance matrix describing the error distribution of the observations. The minimizer
of (9.1) is given by the explicit update formula (9.2).
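For orientation, a standard textbook form of such a cost function and of its minimizer is sketched below. This is the generic 3D-Var formulation, assuming a linearized observation operator H; the precise definitions behind (9.1) and (9.2) may differ in details such as normalization.

\[
J(x) = \tfrac{1}{2}\,(x - x^{(b)})^T B^{-1} (x - x^{(b)})
     + \tfrac{1}{2}\,\bigl(y - H(x)\bigr)^T R^{-1} \bigl(y - H(x)\bigr)
\]
\[
x^{(a)} = x^{(b)} + B H^T \bigl(R + H B H^T\bigr)^{-1} \bigl(y - H(x^{(b)})\bigr)
\]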
The background or first guess $x^{(b)}$ is calculated from an earlier analysis by propagating the
model from a state $x_{k-1}$ at the previous analysis time $t_{k-1}$ to the current analysis time $t_k$. In
the data assimilation code, the minimization of (9.1) is not carried out explicitly via (9.2),
but by a conjugate gradient minimization scheme, i.e. in an iterative manner, first solving
the equation
\[
(R + H B H^T)\, z_k = y - H\bigl(x_k^{(b)}\bigr)
\]
in observation space, calculating $z_k$ at time $t_k$, then projecting the solution back into model
space by
\[
\delta x_k = x_k^{(a)} - x_k^{(b)} = B H^T z_k .
\]
We call $\delta x_k$ the analysis increment.
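These two relations translate almost literally into linear algebra. The following NumPy sketch computes the analysis increment for a toy problem in which the observation operator is a linear matrix H; all dimensions and matrices are invented for illustration and have nothing to do with the operational configuration.

import numpy as np

def analysis_increment(x_b, y, H, B, R):
    """Solve (R + H B H^T) z = y - H x_b in observation space and map the
    solution back to model space: delta_x = B H^T z."""
    d = y - H @ x_b                          # innovation (observation minus first guess)
    z = np.linalg.solve(R + H @ B @ H.T, d)  # observation-space solve
    return B @ H.T @ z                       # analysis increment delta_x

n, p = 100, 10                               # toy model and observation space dimensions
rng = np.random.default_rng(1)
H = rng.standard_normal((p, n)) / np.sqrt(n) # toy linear observation operator
B = np.eye(n)                                # toy background error covariance
R = 0.1 * np.eye(p)                          # toy observation error covariance
x_b = rng.standard_normal(n)                 # first guess
y = H @ x_b + 0.1 * rng.standard_normal(p)   # synthetic observations
x_a = x_b + analysis_increment(x_b, y, H, B, R)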
The background covariance matrix B is calculated from previous model runs by statistical
methods. We employ the so-called NMC method, originally developed at the US National
Meteorological Center. The matrix B thus contains statistical information about the relationship
between the different variables of the model, which is used in each of the assimilation steps.
To obtain a better distribution of the information given by observations, modern data as-
similation algorithms employ a dynamical estimator for the covariance matrix (B-matrix).
Given an ensemble of states $x^{(1)}, \ldots, x^{(L)}$, the standard stochastic covariance estimator
calculates an estimate of the B-matrix by
\[
B = \frac{1}{L-1} \sum_{\ell=1}^{L} \bigl(x_k^{(\ell)} - \bar{x}_k\bigr)\bigl(x_k^{(\ell)} - \bar{x}_k\bigr)^T ,
\qquad (9.3)
\]
where $\bar{x}_k$ denotes the ensemble mean.
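The estimator (9.3) is straightforward to write down in code. The following NumPy sketch is purely illustrative; the array shapes and variable names are assumptions and are not taken from the DACE code.

import numpy as np

def ensemble_covariance(ensemble):
    """Estimate B from an ensemble of shape (L, n): L members, n model
    variables (cf. Eq. 9.3)."""
    L = ensemble.shape[0]
    mean = ensemble.mean(axis=0)        # ensemble mean
    perturbations = ensemble - mean     # deviation of each member from the mean
    return perturbations.T @ perturbations / (L - 1)

# small example: 40 members, 100 model variables
rng = np.random.default_rng(0)
B = ensemble_covariance(rng.standard_normal((40, 100)))
print(B.shape)   # (100, 100)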
This leads us to the Ensemble Kalman Filter (EnKF), where an ensemble is employed
for data assimilation and the covariance is estimated by (9.3). Here, we use the name
EnKF (ensemble Kalman filter) as a generic name for all methods based on this
idea.
In principle, the EnKF carries out cycling as introduced above, except that the propagation
step propagates a whole ensemble of L atmospheric states $x_k^{(a,\ell)}$ from time $t_k$
to time $t_{k+1}$, and the analysis step has to generate L new analysis members, called the
analysis ensemble, based on the first-guess or background ensemble $x^{(b,\ell)}$, $\ell = 1, \ldots, L$.
Usually, the analysis is carried out in observation space, which requires a corresponding
transformation. Moreover, since large-scale data assimilation problems force us to work with
a small number of ensemble members, we need to suppress spurious correlations which arise
from a naive application of (9.3). This technique is known as localization, and the combined
transform-and-localization method is called the localized ensemble transform Kalman filter
(LETKF), first suggested by Hunt et al. (2007).
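To illustrate the localization idea only (not the actual LETKF algorithm used in DACE), the following sketch damps a sampled covariance matrix elementwise with a distance-dependent weight. The Gaussian taper, the 1D coordinates and the length scale are arbitrary choices made purely for illustration.

import numpy as np

def localize_covariance(B, coords, length_scale):
    """Elementwise (Schur-product) localization of a covariance matrix:
    covariances between variables that are far apart are damped towards zero."""
    dist = np.abs(coords[:, None] - coords[None, :])
    taper = np.exp(-0.5 * (dist / length_scale) ** 2)   # illustrative Gaussian taper
    return B * taper

rng = np.random.default_rng(0)
ens = rng.standard_normal((40, 100))          # 40 members, 100 model variables
B_raw = np.cov(ens, rowvar=False)             # sampled covariance, cf. (9.3)
coords = np.linspace(0.0, 1000.0, 100)        # e.g. grid-point positions in km
B_loc = localize_covariance(B_raw, coords, length_scale=200.0)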
The DWD data assimilation coding environment (DACE) provides a state-of-the-art
implementation of the LETKF, which is equipped with several important ingredients such
as different types of covariance inflation. These are needed to properly take care of the
model error: the original Kalman filter does not know what error the model has and thus,
by default, underestimates this error, which is counteracted by this collection of
tools.
The combination of variational and ensemble methods provides many possibilities to further
improve the state estimation of data assimilation. Based on the LETKF ensemble Kalman
filter, the data assimilation coding environment provides a hybrid system, EnVar, the
ensemble-variational data assimilation.
The basic idea of EnVar is to use the dynamical, flow-dependent ensemble covariance
matrix B as part of the three-dimensional variational assimilation. Here, localization
is a crucial issue, since the LETKF localizes in observation space, whereas 3D-Var
employs B in state space. In DACE, this localization is carried out by a diffusion-type
approximation.
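The hybrid aspect can be summarized as a convex combination of the climatological and the ensemble covariance matrix. The weighting factor $\beta$ below is only a generic placeholder; the weights actually used operationally are not specified in this tutorial.

\[
B_{\mathrm{hybrid}} = (1-\beta)\, B_{\mathrm{clim}} + \beta\, B_{\mathrm{ens}} ,
\qquad 0 \le \beta \le 1 .
\]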
The EnVar cycling needs to cycle both the ensemble $x^{(\ell)}$, $\ell = 1, \ldots, L$, and one
deterministic state $x^{\mathrm{det}}$. The resolution of the ensemble can be lower than the full
deterministic resolution. By default we currently employ a 40 km resolution for the ensemble
and a 13 km global resolution for the deterministic state. The ensemble B-matrix is then
carried over to the finer deterministic resolution by interpolation. See Section 9.2 for more
details on the operational assimilation system at DWD.
DACE provides additional modules for Sea Surface Temperature (SST) analysis, Soil Mois-
ture Analysis (SMA) and snow analysis. Characteristic time scales of surface and soil
processes are typically larger than those of atmospheric processes. Therefore, it is often
sufficient to carry out surface analysis only every 6 to 24 hours.
The assimilation cycle iterates the steps described in Section 9.1: a short-range ICON
forecast (first guess) is updated using the observations available for that time window to
generate an analysis, from which a new, updated first guess is then started.
The core assimilation for atmospheric fields is based on a hybrid system (EnVar) as described
in Section 9.1.3. At every assimilation step (every 3 h) an LETKF is run using an
ensemble of ICON first guesses. Currently, the ensemble consists of 40 members with a
horizontal resolution of 40 km and a 20 km nest over Europe. A convex linear combination
of the 3D-Var climatological covariance matrix and the LETKF's (flow-dependent) covariance
matrix is then used to run a deterministic 3D-Var analysis at 13 km horizontal resolution,
which in turn is used to initialize the deterministic main forecast at the same resolution.
In addition, the above-mentioned surface modules are run: the Sea Surface Temperature (SST)
analysis, the Soil Moisture Analysis (SMA) and the snow analysis.
Note that for the ICON-EU nest no assimilation of atmospheric fields is conducted. The
analysis fields necessary to initialize the nest are interpolated from the underlying global
grid. A separate surface analysis, however, is conducted.
The input, output and processes involved in the assimilation cycle are briefly described
below:
Atmospheric analysis
Carried out at every assimilation time step (3 h) using the data assimilation algorithms
described in the previous sections.
Main input: First guess, observations, previous analysis error, online bias correction files.
Main output: Analysis of the atmospheric fields, analysis error, bias correction files,
feedback files with information on the observation, its departures to first guess and analysis.
The system can make use of the following observations: radiosondes, weather stations,
buoys, aircraft, ships, radio occultations, AMV winds and radiances. Available general
features of the module are variational quality control and (variational) online bias correc-
tion. Regarding EnKF specifics, different types of inflation techniques, relaxation to prior
perturbations and spread, adaptive localization, SST perturbations and SMA perturba-
tions are available.
Snow analysis
Fields modified by the snow analysis (see Appendix C for a description of each
variable): freshsnow, h_snow, rho_snow, t_snow, w_i, w_snow.
Main input: SYNOP snow depth observations, if their coverage is sufficient. If this is not
the case, additional sources of information are used until the number of observations is high
enough, namely (and in this order): precipitation and 2 m temperature, direct observations
(ww reports) and the NCEP external snow analysis.
Sea surface temperature analysis
Fields modified by the SST analysis (see Appendix C for a description of each
variable): fr_seaice, h_ice, t_ice, t_so.
Main input: NCEP analysis from the previous day (which uses satellite, buoy and ship
observations, to be used as a first guess), ship and buoy observations available since the
time of the NCEP analysis.
Soil moisture analysis
Fields modified by the SMA (see Appendix C for a description of each
variable): w_so.
Main input: Background fields of the relevant variables at every hour since the last
assimilation, and the 2 m temperature analysis (see below), which is used as observations.
2m-temperature analysis
Although carried out only at 0 UTC, it is run for several time steps in between in order to
provide the output (2 m temperature) needed by the SMA. It uses observations from SYNOP
stations on land and METAR information from airports.
The NWP exercises this week will be mainly carried out on the Cray XC 40 supercomputer
system at DWD. This system consists of several compute nodes with corresponding service
nodes. Some of the service nodes are the so-called login-nodes (with names xce00.dwd.de
and xce01.dwd.de), on which you can use training accounts.
• xce00, xce01:
These are the login nodes, which run SUSE Linux Enterprise Server (SLES). The
nodes are used for compiling and linking, preparation of data, basic editing work
and visualization of meteorological fields. They are not used for running parallel
programs, but jobs can be submitted from them to the Cray XC 40 compute nodes.
• Cray XC 40:
The Cray XC 40 has 432 compute nodes, where each node is equipped with 2 Intel
Haswell processors with 12 cores (a second node partition with 544 Intel Broadwell
nodes will not be used in these exercises). Each node therefore has 24 computational
cores. These nodes cannot be accessed interactively, but only by batch jobs. Such
jobs can use up to 62 GByte of main memory per node, which is about 2.5 GByte
per core. For normal test jobs it should be enough to use 10-15 nodes (depending on
the chosen grid resolution).
There is a common filesystem across all nodes and every user has three different main
directories:
• /e/uhome/username ($HOME)
Directory for storing source code and scripts to run the model. This is a Panasas file
system suitable for many small files.
• /e/uwork/username ($WORK)
Directory for storing larger amounts of data.
• /e/uscratch/username ($SCRATCH)
Directory for storing very large amounts of data. For the $WORK and the $SCRATCH
filesystem there is a common quota for every user of 2.5 TByte.
Jobs for the Cray XC 40 system have to be submitted with the batch system PBS. These
batch jobs may either be launched from the Linux cluster lce or from the XC 40 login
nodes xce00/01. Together with the source code of the programs we provide some run
scripts in which all necessary batch-commands are set.
Here are the most important commands for working with PBS:
qsub job_name: submit a batch job to PBS. This is done in the run scripts.
qstat: query the status of all batch jobs on the XC 40. You can see whether jobs
are Q (queued) or R (running). You have to submit jobs to the queue xc_norm_h.
qstat -u user: query the status of all your batch jobs on the machine.
qdel job_nr@machine: cancel your job(s) from the batch queue of a machine.
The job_nr is given by qstat -w.
In your run scripts, execution begins in your home directory, regardless of which directory
your script resides in or where you submitted the job from. You can use the cd command
to change to a different directory. The environment variable $PBS_O_WORKDIR makes it easy
to return to the directory from which you submitted the job:
cd $PBS_O_WORKDIR
When you work with the ICON software package, you may run into a number of problems.
Some of them are due to imperfect (or even missing) documentation, others are caused by
program bugs. We apologize right now for any inconvenience this may cause. But there
are also troubles resulting from compiler bugs or hardware problems.
In the following, we describe some problems that can occur during the individual
phases of installing and running the model. Where possible, we also give some hints
towards a solution.
These are some of the most common difficulties when building and running the software:
• IFS2ICON: Check the data files retrieved from the ECMWF MARS database. All
fields that are defined in your iconremap namelist settings must be contained in this
input data file. For GRIB1 input data, provide the correct field parameter with the
namelist parameter code.
• If the cause of the error still does not become clear, you may increase model output
verbosity by setting the command-line options -v, -vv, -vvv etc.
• Stop right there, and don’t move. Speak to the bear in a low, calm voice, and slowly
raise your arms up above your head. Clearly, you should try to leave now. Do it
slowly and go back from whence you came. Don’t cross the path of the bear (or any
cubs, if present).
Caveat: Defaults defined in the ICON code after the namelist read-in are not moni-
tored.
Finally, your namelist settings may have been specified w.r.t. a former version of
ICON. In that case, certain parameters may have been declared deprecated.
Take a look at the namelist documentation doc/Namelist_overview.pdf
which comes with your version of the ICON model code. This document contains a
section on incompatible changes and may provide some useful hints.
• If the cause of the error still does not become clear, you may increase the model
output verbosity; see the namelist parameter msg_level in the namelist run_nml.
There are surely many more reasons for problems and errors, whose discussion goes beyond
the scope of this tutorial. It really gets troublesome when a program aborts and writes
core files. Then you will need some computer tools (like a debugger) to investigate the
problem. If you cannot figure out what the reason for your problem is we can try to give
some support.
The following table contains the NWP variables available for output. Please note that
the field names follow an ICON-internal nomenclature, see Section 4.3 for details.
By "ICON-internal" variable names, we denote those field names that are provided
as the string argument name to the subroutine calls CALL add_var(...) and
CALL add_ref(...) inside the ICON source code. These subroutine calls have the purpose
of registering new variables, allocating the necessary memory, and setting the meta-data
for these variables.
Therefore, if you are interested in the model output of a certain variable and if this
variable is not listed in the table below, you may search for the corresponding call to
add_var/add_ref in the source code instead.
Variable name Description
Bibliography
Bloom, S. C., L. L. Takacs, A. M. D. Silva, and D. Ledvina, 1996: Data assimilation using
incremental analysis updates. Mon. Wea. Rev., 124, 1256–1270.
Gal-Chen, T. and R. Somerville, 1975: On the use of a coordinate transformation for the
solution of the Navier-Stokes equations. J. Comput. Phys., 17, 209–228.
Hunt, B. R., E. Kostelich, and I. Szunyogh, 2007: Efficient data assimilation for spatiotem-
poral chaos: A Local Ensemble Transform Kalman Filter. Physica D, 230, 112–126.
Jablonowski, C., P. Lauritzen, R. Nair, and M. Taylor, 2008: Idealized test cases for the
dynamical cores of Atmospheric General Circulation Models: A proposal for the NCAR
ASP 2008 summer colloquium. National Center for Atmospheric Research (NCAR).
Jablonowski, C. and D. L. Williamson, 2006: A baroclinic instability test case for atmo-
spheric model dynamical cores. Q. J. R. Meteorol. Soc., 132, 2943–2975.
Klemp, J., 2011: A terrain-following coordinate with smoothed coordinate surfaces. Mon.
Wea. Rev., 139, 2163–2169.
Leuenberger, D., M. Koller, and C. Schär, 2010: A generalization of the SLEVE vertical
coordinate. Mon. Wea. Rev., 138, 3683–3689.
Lott, F. and M. J. Miller, 1997: A new subgrid-scale orographic drag parametrization: Its
formulation and testing. Q. J. R. Meteorol. Soc., 123(537), 101–127.
Lundgren, K., B. Vogel, H. Vogel, and C. Kottmeier, 2013: Direct radiative effects of sea
salt for the Mediterranean region under conditions of low to moderate wind speeds. J.
Geophys. Res., 118(4), 1906–1923.
Mironov, D., B. Ritter, J.-P. Schulz, M. Buchhold, M. Lange, and E. Machulskaya, 2012:
Parameterisation of sea and lake ice in numerical weather prediction models of the
German Weather Service. Tellus A, 64(0).
Neggers, R. A. J., M. Köhler, and A. C. M. Beljaars, 2009: A dual mass flux framework
for boundary layer convection. Part I: Transport. Journal of the Atmospheric Sciences,
66(6), 1465–1487.
Orr, A., P. Bechtold, J. Scinocca, M. Ern, and M. Janiskova, 2010: Improved middle
atmosphere climate and forecasts in the ECMWF model through a nonorographic gravity
wave drag parameterization. Journal of Climate, 23(22), 5905–5926.
Polavarapu, S., S. Ren, A. M. Clayton, D. Sankey, and Y. Rochon, 2004: On the rela-
tionship between incremental analysis updating and incremental digital filtering. Mon.
Wea. Rev., 132, 2495–2502.
Prill, F., 2014: DWD ICON Tools Documentation. Deutscher Wetterdienst (DWD).
dwd_icon_tools/doc/icontools_doc.pdf.
Raschendorfer, M., 2001: The new turbulence parameterization of LM. In COSMO News
Letter No. 1, Consortium for Small-Scale Modelling, 89–97.
Schär, C., D. Leuenberger, O. Fuhrer, D. Lüthi, and C. Girard, 2002: A new terrain-
following vertical coordinate formulation for atmospheric prediction models. Mon. Wea.
Rev., 130, 2459–2480.
Schrodin, R. and E. Heise, 2002: A new multi-layer soil-model. In COSMO News Letter
No. 2, Consortium for Small-Scale Modelling, 149–151.
Tiedtke, M., 1989: A comprehensive mass flux scheme for cumulus parameterization in
large-scale models. Mon. Wea. Rev., 117(8), 1779–1800.
Zängl, G., D. Reinert, F. Prill, M. Giorgetta, L. Kornblueh, and L. Linardakis, 2014: ICON
User’s Guide. DWD & MPI-M.
Zängl, G., D. Reinert, P. Ripodas, and M. Baldauf, 2015: The ICON (ICOsahedral
Non-hydrostatic) modelling framework of DWD and MPI-M: Description of the non-
hydrostatic dynamical core. Q. J. R. Meteorol. Soc., 141, 563–579.
See icon-dev/doc/Namelist_overview.pdf
for a complete list of available namelist parameters for the ICON model. ICON-ART
specific namelists are described in the ICON-ART documentation.
ana_varnames_map_file, 53
art_nml (Namelist), 114
atmo_dyn_grids, 47
bdy_indexing_depth, 16, 16
cart_volcano_file, 124
diffusion_nml (Namelist), 43
dom, 57, 58
dt_checkpoint, 88, 88, 89
dt_checkpoint (Namelist), 91
dt_conv, 51, 51
dt_gwd, 51
dt_iau, 55, 55, 56, 64
dt_rad, 51, 51
dt_restart, 88, 89
dt_shift, 55, 55, 56, 64
dt_sso, 51
dtime, 46, 65
dtime_latbc, 73, 82
dwdana_filename, 54, 54, 55, 64
dwdfg_filename, 54, 54, 55, 64, 73, 82
dynamics_grid_filename, 44, 52, 53, 63, 67
dynamics_nml (Namelist), 43
dynamics_parent_grid_id, 45, 47, 52, 67
end_datetime_string, 51, 51, 63, 82, 89, 93
extpar_filename, 52, 52, 64, 89
extpar_nml (Namelist), 43, 52
filetype, 57, 57
flat_height, 40
grid_nml (Namelist), 39, 44, 45, 47, 52, 67, 71, 86, 87
gridgen_nml (Namelist), 16, 17
h_levels, 88
hbot_qvsubstep, 85, 85, 86
hl_varlist, 87, 87
htop_moist_proc, 85, 85, 86, 91, 93
i_levels, 88
iart_seasalt, 122
iart_volcano, 122
iforcing, 42, 43, 47, 52, 63, 89
ifs2icon_filename, 55, 89
ihadv_tracer, 85
il_varlist, 87, 88
in_filename, 33
in_grid_filename, 31, 33
in_type, 31
ini_datetime_string, 51, 51, 54, 55, 63, 82, 89
init_mode, 53, 53–55, 64, 73, 73, 82, 89
initicon_nml (Namelist), 53–55, 73
input_field_nml (Namelist), 27, 28, 31
inwp_cldcover, 75
inwp_convection, 75
inwp_gscp, 75, 75, 78, 78
inwp_gwd, 75
inwp_radiation, 75
inwp_sso, 75
inwp_surface, 75
inwp_turb, 75
io_nml (Namelist), 57, 88
itopo, 43, 43, 47, 52, 63, 89
itype_latbc, 72
ncstorage_file, 31
ndyn_substeps, 49, 50, 65
nh_test_name, 43, 43, 47
nh_testcase_nml (Namelist), 37, 43, 48
nlev_latbc, 73, 82
nonhydrostatic_nml (Namelist), 38, 40, 43, 49, 85
nsteps, 46, 51, 93
ntiles, 19
num_io_procs, 58, 63, 89
num_lev, 38, 40, 47, 67
num_prefetch_proc, 74
num_restart_procs, 58, 89
nwp_phy_nml (Namelist), 43, 51, 75, 78
List of Exercises