Storage Handling and Diagnostic System (STASH) : Unified Model Documentation Paper C04
UM Version : 10.1
Last Updated : 2015-03-13 (for vn10.1)
Owner : Richard Barnes
Met Office
FitzRoy Road
Exeter
Devon EX1 3PB
United Kingdom
This document has not been published; permission to quote from it must be obtained from the Unified Model
system manager at the above address.
Contents
1 Introduction
6 Acronyms
1 Introduction
The STASH system embraces the logical components of the UM responsible for generating versatile and optional
model diagnostic fields for a range of model configurations and applications, such that output is in a
standard format (the model fieldsfile) and new diagnostics can be readily introduced. In order to meet these
goals, STASH needs to know the data locations of all prospective data for output, and this is achieved by STASH
also controlling the set-up of all data addressing within the model.
Since the UM provides for automatic set-up of different configurations of the model, with different choices of
physics schemes and therefore potentially different combinations of prognostic and ancillary fields, STASH also
needs to have information on the choice of schemes, and which diagnostics are available for which scheme.
Each data field has its STASHcode (a combined model, section and item number) which uniquely labels any
primary, ancillary or diagnostic field. The basic building block of data for STASH is usually a horizontal field,
and most processing is performed level by level. STASH does not perform horizontal or vertical interpolation —
for example output fields on pressure or height surfaces must first be interpolated before being output by the
STASH routines.
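As an illustration of how a section and item pair can be combined into one label, the sketch below assumes the convention commonly seen in fieldsfile and PP headers, where the code is packed as section × 1000 + item and the model number is carried separately. The function names are hypothetical and are not UM routines; the ranges checked are the section (0–99) and item (1–999) ranges quoted in this paper.

```python
def pack_stashcode(section: int, item: int) -> int:
    """Combine section and item into one integer (assumed: section*1000 + item)."""
    if not 0 <= section <= 99:
        raise ValueError("section must be in the range 0-99")
    if not 1 <= item <= 999:
        raise ValueError("item must be in the range 1-999")
    return section * 1000 + item


def unpack_stashcode(code: int) -> tuple:
    """Split a packed code back into (section, item)."""
    return divmod(code, 1000)


# Section 34 (UKCA), item 1
print(pack_stashcode(34, 1))      # 34001
print(unpack_stashcode(16222))    # (16, 222)
```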
The control of diagnostic requests begins in the GUI, where users can select from tables of available diagnostics.
During job processing a file named STASHC is created. Appendix A describes the content of STASHC namelists
produced by the Rose GUI. This file is read during model initialisation at run time and converted into an internal
collection of STASH related arrays, in particular a “STASHlist” array STlist that holds the STASH requests. During
execution of each timestep of the model run, there are a number of calls to the service subroutine STASH, which
may then extract, process and output any data requests, usually to a set of fieldsfiles on external disk storage.
The acronym STASH stands for “Spatial and Temporal Averaging and Storage Handling”. In view of the complexity
of tasks handled by the STASH system, it is worth mentioning the motivation which lay behind its original
development in the 1990s. A key aspect of the UM’s design was “plug-compatibility”. Plug-compatibility rules, to
which the “physics” routines conform, require that subroutines make no assumptions about the format of permanent
storage, that they do not call any routines which do so, and that they communicate with other routines only
via their argument lists. “Dynamics” routines also obey these rules as far as possible, given that they have to
calculate horizontal differences. It was desirable that any experimental diagnostic could be easily introduced into
a plug-compatible routine (PCR) without the developer having to know anything about the storage or averaging
needed.
To facilitate this, a very general user-friendly storage and averaging system is required. The system must cope
with a variety of non-meteorological processing such as averaging, but physical diagnostic calculations do not
come within its remit — it only deals with their results. It must be able to apply several possible treatments in time
(e.g. replace, accumulate), and a number of options for spatial treatment (e.g. full fields, zonal and meridional
means, limited-area and single-point data). A particular quantity may be diagnosed in more than one way at
once. Code of the complexity required for STASH processing is best not duplicated in different control routines,
and for clarity it is better kept distinct from other control functions. To reduce overheads, the various STASH
processing functions should be grouped in a single set of modular routines. Part of the design complexity lay in
the requirement to minimise memory overheads and re-use diagnostic space whenever possible.
Routines below subroutine STASH themselves form a generic system which is called by each internal model
section. The input to subroutine STASH comes either from the main data array D1 (a prognostic or a diagnostic
previously stored there by STASH) or from STASH’s own work array STASHwork. Diagnostic output produced
by a PCR is either stored in a local work array and later copied to STASHwork, or stored directly if STASHwork is
passed as an argument. When STASHwork is used directly (an “intercepted” diagnostic), the full vertical domain
is assumed, and no compression is allowed. For diagnostics copied from temporary workspace to STASHwork,
vertical compression is allowed. Vertical compression may be done by the copying routine or within the PCR.
For climate runs, all diagnostics that cannot be derived from other quantities must be held in the dumps (produced
at specified intervals), where the automatic meaning programs can access them. For operational forecast
use, however, there is a requirement to produce post-processing fieldsfiles at more frequent intervals, for diagnostics
that will not be in the dumps. It was most efficient to produce fieldsfiles for these quantities directly
from the STASH routine. STASH is therefore called not only after the PCRs which perform the integration of the
model governing equations, but also after a set of PCRs dedicated entirely to producing diagnostics.
The above account concentrates on diagnostics since they are arbitrarily variable and so more complicated to
deal with than the regular quantities that are passed between PCRs for the model code to run. In general the
term “diagnostic” is used to refer to all output from a PCR. This may even include the primary model variables if
they are processed by STASH. However the updating of primary fields, or production of field increments, will be
hardwired into each PCR and will not be part of the STASH system.
The “new dynamics” (vn5.0 onwards) and “ENDGame” (vn8.6 onwards) codes incorporate split physics, instead
of the sequential physics updates of the earliest dynamical core. Primary variables are not incremented after
each section, and there is no information to be gained from intercepting them between sections. Instead,
increments to primary variables are calculated, and these, along with ordinary diagnostics, can be requested.
In order to minimise associated lower-level code changes, STASH calls are made at the Atm_Step interface,
which means that there is a separate numbered STASHwork array for each section.
Data fields are held sequentially in the main data array D1, with primary fields in STASHcode order, followed by
any diagnostic fields in STASHcode order, and any secondary fields again in order. Here “primary fields” refer
to prognostic and ancillary fields, needed for initial data in the model dump to start a run. “Diagnostic fields”
are diagnostics with time meaning or other time processing such that a copy of their data needs to be held from
timestep to timestep, and also needs to be held in the dump for continuation runs (CRUNs). “Secondary fields”
are fields required throughout the timestep, but derived from initial data and so not needed in the dump.
When the Unified Model is run on multiple processors (MPP), each processing element (PE) carries the subdomain
values of D1 according to the data decomposition requested for MPP parallel running. That is, each PE has
its own D1 array which is populated from the input dump using MPP scattering routines, and reconstituted for
an output dump using MPP gathering routines.
The D1 array and its data pointers are hidden from the user above subroutines INITIAL and ATM_STEP. The
definition of a prognostic variable has been extended from section zero (atmosphere) to section 33 (free tracers),
section 34 (UKCA chemical species) and beyond, but all are part of the atmosphere model and use the one
STASHmaster_A file. Dimensioning of a few data addressing arrays by “4” is the last remnant of when the UM
code could potentially include ocean and wave submodels.
Other models are now developed in separate code repositories, and atmosphere-ocean-ice coupling is covered
in UMDP C02.
The atmosphere model has a “STASHmaster” file (see Appendix C) which defines the characteristics of all
its data fields, including those for various atmospheric chemistry options. The STASHmaster file is linked to
a particular UM version, but the user may add or overwrite records, by editing the copy of “STASHmaster_A”
in their branch or working copy. This is how additional prognostics and/or diagnostics are defined. The main
STASHmaster file resides on the UM repository trunk, and is read in during model initialisation. A copy is also
used within the Rose GUI software to enable diagnostics requests to be set up and verified.
Under the Rose suite environment, make any changes directly to the STASHmaster_A file at
“rose-meta/um-atmos/HEAD/etc/stash/STASHmaster” in your branch/working copy.
The atmosphere model is divided into many processing sections, with a corresponding call to STASH for each
section in the routine Atm_Step. The STASHmaster_A file is a massive collection of field records ordered by
section number (0–99, of which 50 are currently used) and, within each section, by item number (1–999).
Thus each STASHmaster field record is identified uniquely by three numbers: the internal model id (currently
redundant), the section number and the item number. Table 1 defines sections in use or reserved for the
atmosphere model.
The top-level control routine for initialising STASH request processing and the addressing system is
STASH_PROC; the figure below shows how the relevant subroutines are related.
UM_SHELL
|-- UM_Submodel_Init
|-- STASH_PROC
|   |-- RDBASIS
|   |-- read_atmos_stashmaster
|   |-- PRELIM
|   |-- OUTPTL
|   |-- INPUTL
|   |-- ADDRES
|   |   `-- PRIMARY
|   |       |-- TSTMSK
|   |       |   `-- TSTMSK_UKCA
|   |       `-- PSLEVCOD
|   `-- WSTLST
`-- U_MODEL
    |-- compress_atmos_stashmaster
    `-- INITIAL
        `-- INITCTL
The subroutines perform tasks as follows:
UM_Submodel_Init Sets up the arrays which define the submodel partitioning arrangements, i.e. they specify
how many submodels are in the run, and which internal models each submodel contains. (Now redundant.)
STASH_PROC Carries out the addressing of data fields and processing of STASH requests. The STASH control
and addressing information generated by STASH_PROC is held in a number of arrays, including the
STASHlist array, which contains the composite list of individual STASH requests. See Appendix B for details.
These are passed to routine INITIAL via modules and a few remnant COMMON blocks. When this
processing is complete, it is then known which subset of the STASHmaster records is required for the
particular UM experiment. The total number of STASHmaster records (including any user-defined records)
is passed to U_MODEL and used to dynamically allocate the STASHwork arrays.
compress_atmos_stashmaster Called by U_MODEL to compress the space taken by the STASHmaster to
only the active records.
INITCTL Is called by INITIAL to transfer the STASH control and addressing information to arrays that are passed
to the STASH system.
Below STASH_PROC:
RDBASIS Reads the STASHC file information containing diagnostic requests, and combines this with configuration
details and sizes held in SIZES and CNTLATM files.
read_atmos_stashmaster Opens and reads STASHmaster records into dynamically allocated arrays. Functions
EXPPXC and EXPPXI are provided to extract individual character and integer elements (respectively)
of STASHmaster data from the STASHmaster arrays.
PRELIM Generates a preliminary STASHlist array.
OUTPTL Calculates output lengths for diagnostics.
INPUTL Calculates input lengths for diagnostics.
ADDRES Determines addresses for primary and diagnostic data in array D1 and for transient diagnostics in
workspace array STASHwork.
PRIMARY Computes data lengths and addresses for primary fields.
TSTMSK Selects prognostic fields required for the scientific configuration and checks availability of requested
diagnostics. This important routine often needs to be updated when new prognostics or physics options
are added to the model. A subsidiary routine TSTMSK_UKCA is called if the UKCA chemistry and aerosol
scheme is being run, to deal with the specific complexities of its data and diagnostics.
WSTLST Prints STASH control arrays and finalises STASH initialisation.
PSLEVCOD, PSLIMS Interpret any pseudo-level requests.
The computation of data lengths and start addresses within the D1 array for primary fields is carried out by
routine ADDRES. The reconfiguration uses its own code Rcf_Address that performs a similar task for the reconfigured
dump addressing.
Each STASHmaster record includes a “version mask” and an “option code”. The version mask is a 20-digit
binary integer string that defines which UM releases this STASH item is available to. The option code is a 30-digit
decimal integer string, which defines the availability of the STASH item within the internal model section
to which it belongs. See Appendix C for details. In general, a given STASH item in a particular section will be
available for certain versions of that section and not for others.
TSTMSK is called (by PRIMARY from ADDRES within the model, and by Rcf_Address within the reconfiguration)
to perform the checking of version masks and option codes, and so determine the list of primary items
required for the particular UM experiment. The same system is used to check availability of diagnostics. A
list of section-versions is read from the SIZES namelist file for TSTMSK to check option codes against. The
subsidiary routine TSTMSK_UKCA is called if the UKCA chemistry and aerosol scheme is being run, to deal
with the specific complexities of its data and diagnostics.
Rcf_Address generates the list of primary items for the reconfigured dump, while ADDRES also computes the
data length and start address in D1 for each primary field, taking into account horizontal domains, vertical levels
and pseudo-levels.
In general, the output dump produced by a UM run may contain additional non-primary fields. This will be the
case if any of the diagnostic requests has the dump specified as its output destination. When a dump is
reconfigured, the dump produced will only contain the primary fields from the original dump. Only a continuation
run (CRUN) preserves and uses any diagnostic fields in the output dump from one run when it is read into the
next run in a series.
case of a time series diagnostic request.) Output lengths are calculated by routine OUTPTL. The input length
is in many cases the same as the output length, but there are a number of cases where the two will differ. For
example, the input to some diagnostics (the non-intercepted ones — see section 2) is allowed to be on a subset
of the possible levels rather than on all the levels. Suppose that such a diagnostic is requested a number of
times, each with different (and possibly overlapping) sets of output levels. In such a case, the input levels list
would be the superset of all the output levels, with any overlap levels being counted once only. Each of these
diagnostics would have an input length determined by the number of levels in the superset, and an output length
determined by the number of levels in its particular output list. The same situation can occur for input and output
pseudo levels. Routine INPUTL constructs the supersets and computes the input lengths. INPUTL also deals
with diagnostics for which the input must be on all available pseudo levels; in such a case, the range of pseudo
levels is supplied as a list in the same array that contains the supersets of pseudo levels.
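The relationship between the input superset and the individual output lengths can be illustrated with a short sketch. This is not the INPUTL code: the function name, the request lists and the field size of 6 points are all hypothetical, chosen only to show that overlapping levels are counted once in the input length.

```python
def input_levels_superset(requests):
    """Return the sorted union of all requested output levels (each level once)."""
    levels = set()
    for out_levels in requests:
        levels.update(out_levels)
    return sorted(levels)


field_size = 6  # hypothetical number of points in one horizontal field

# Three requests for the same diagnostic, with overlapping output levels
requests = [[1, 2, 3], [3, 4], [2, 5]]

superset = input_levels_superset(requests)
input_length = len(superset) * field_size
output_lengths = [len(levels) * field_size for levels in requests]

print(superset)        # [1, 2, 3, 4, 5]
print(input_length)    # 30
print(output_lengths)  # [18, 12, 12]
```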
After the diagnostic data lengths have been computed, routine ADDRES performs the addressing computations
for the diagnostic space in D1 and the STASHwork array, including any secondary space which may be required.
This is just a matter of adding up the lengths in the correct sequence to obtain the start address for each data
field. The capability exists for diagnostics to be copies of section 0 prognostics; in such a case, the appropriate
slot in the primary section of D1 is used for the diagnostic.
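The "adding up the lengths" step amounts to a running sum. The sketch below is a simplified illustration, not the ADDRES routine: it assumes 1-based addressing as in a Fortran array, and the function name and field lengths are hypothetical.

```python
def start_addresses(lengths, first=1):
    """Return (start address of each field, total length used), 1-based by default."""
    addresses = []
    next_free = first
    for length in lengths:
        addresses.append(next_free)  # this field starts at the next free location
        next_free += length          # the following field starts after it
    return addresses, next_free - first


# Hypothetical lengths of three consecutive fields
addresses, total = start_addresses([100, 100, 40])
print(addresses)  # [1, 101, 201]
print(total)      # 240
```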
Sections 33 and 34 are further prognostic sections within the atmosphere STASHmaster file, so that larger
numbers (up to 150 initially) of free tracers and chemical species may be chosen. Section 34 is for the UKCA
chemistry submodel. The addressing code treats sections 33 & 34 (and sections 31, 32, 37 & 38 for lateral
boundary fields) in a similar way to prognostic section 0.
Subroutine STASH is a general top-level service routine called in many locations within the model. It is the
top-level interface to subroutine STWORK, which is called for each STASH variable. At each timestep, subroutine
SETTSCTL is executed to set logical flags, known as STASHflags, to determine first whether certain diagnostic
code can be skipped, and then to select which diagnostics do require processing through STASH during this
timestep.
To copy data into the transient STASHwork arrays, 2 routines are provided:-
COPYDIAG_3D copies selected levels from a 3D array into STASHwork, and
COPYDIAG performs the same function for a simple horizontal field.
In both “new dynamics” and “ENDGame” atmosphere models, transfer of data is simplified because diagnostics
have no halos and there is no need for special treatment of polar rows. However it is preferable to continue
to use the service routines to provide for possible future developments, and to standardise this function in the
code.
For each new STASHmaster record, STASH_PROC generates space in the STASHwork array for diagnostics,
and SETTSCTL sets STASHflags as required. Then all that is necessary is to copy the field containing the
new diagnostic into the STASHwork array in the relevant subroutine where diagnostics for that section are being
calculated. Suitable code would look like:-
REAL new_diagnostic_3d(3d_field), & ! new item1
     new_diagnostic(2d_field)       ! new item2
! Only calculate the diagnostics if at least one of them is requested
IF (sf(item1,sect) .OR. sf(item2,sect)) THEN
  CALL calculate_new_diagnostic(new_diagnostic_3d,new_diagnostic,......)
END IF
! Copy each requested diagnostic to its slot in STASHwork, located via SI
IF (sf(item1,sect)) THEN
  CALL copydiag_3d(stashwork(si(item1,sect,im_index)),new_diagnostic_3d,....)
END IF
IF (sf(item2,sect)) THEN
  CALL copydiag(stashwork(si(item2,sect,im_index)),new_diagnostic,...)
END IF
where new items 1 and 2 are in section sect, stashwork is the STASH work array for that section and other variables
are available through the #include “argsts.h” Fortran include file, which includes sf, stlist and other related
information. The existing call to STASH for this section will process data in the STASHwork array without further
intervention. The array “sf” could be replaced with “sf_calc” since we should only be interested in instantaneous
extractions rather than time processed STASH requests. See Appendix B.
From the new STASHmaster record, STASH_PROC generates space in the D1 array for the new prognostic,
which will normally be in section 0, though sections 31 and 32 are used for lateral boundary conditions (LBCs),
section 33 for free tracers and section 34 for the UKCA chemistry model species.
To add a new prognostic field to the UM Atmosphere model, use the following steps.
First set up a new “jpointer” in SET_ATM_POINTERS. For a single-level field add the following generic code:-
JFIELD = SI(item_number,sect_no,im_index)
where sect_no will be 0 or one of the above, and im_index will be 1 (for atmosphere). The item_number is the
new prognostic’s STASH code.
For a multi-level field also add the following additional generic code:-
DO lev = 2,n_levels
  JFIELD(lev) = JFIELD(lev-1)+off_size
END DO
where n_levels is the number of levels of the field, and off_size is one of the predefined values u_field_size,
u_halo_size, u_off_size, theta_field_size, theta_halo_size, theta_off_size, etc., depending on the grid and halos the
field uses.
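The per-level pointer arithmetic above can be checked with a small sketch. It mirrors the generic loop only; the function name, the starting address 1001 and the offset 250 are hypothetical values standing in for SI(item_number,sect_no,im_index) and off_size.

```python
def level_pointers(first_address, n_levels, off_size):
    """Return the start address of each level, spaced off_size apart."""
    pointers = [first_address]
    for lev in range(1, n_levels):
        # Same recurrence as JFIELD(lev) = JFIELD(lev-1) + off_size
        pointers.append(pointers[lev - 1] + off_size)
    return pointers


print(level_pointers(1001, 4, 250))  # [1001, 1251, 1501, 1751]
```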
Then modify UM_INDEX_A by adding a new entry to A_IXPTR, e.g.:-
A_IXPTR(last_item+1) = A_IXPTR(last_item) + N_LEVELS
where N_LEVELS is the number of levels of the field added to A_IXPTR before this new one.
Increase the length of A_SPPTR_LEN as follows:-
A_SPPTR_LEN = A_IXPTR(last_item+1) + N_LEVELS
where N_LEVELS is the number of levels of the new field.
Also increase the length of A_IXPTR_LEN in file spindex.h by 1 for each new prognostic, and add the new
“jpointer” to the include files artptra.h, argptra.h, and typptra.h.
A Fortran pointer for the new field is required in atm_fields_mod:-
REAL, POINTER :: FIELD(:)
and an entry in arg_atm_fields.h:-
FIELD, &
plus an entry in typ_atm_fields.h:-
REAL :: FIELD(tdims_s%i_start:tdims_s%i_end, &
tdims_s%j_start:tdims_s%j_end, &
tdims_s%k_start:tdims_s%k_end)
or similar, where tdims_s values are for a theta-type field and pdims, qdims, udims, vdims and wdims values are
also defined, with s indicating “small” halos and l “large” ones. The number of levels for the field is k_end-k_start+1.
Finally, in order to have access to the new prognostic in atm_step the FIELD needs to be assigned to the
appropriate part of the D1 array in set_atm_fields.
E.g. field => D1(JField(qdims%k_start) : &
JField(qdims%k_start) + &
field_length(theta_points,extended_halo,qdims%k_end-qdims%k_start+1) -1)
The new prognostic can now be used as needed, such as passing it down to other subroutines from atm_step.
The reconfiguration will also cater for new STASHmaster records of primary fields, allocating space in the reconfigured
dump, and allowing selection of a number of types of initialisation. For more complicated initialisation
dependent on other variables in the dump, explicit coding will be needed in the reconfiguration.
6 Acronyms
Table 2: Acronyms
STASH related control information is defined in Fortran include file #include “typsts.h” and passed by argument
through #include “argsts.h”. This contains a number of arrays, the most significant being:-
STASHlist STlist(LEN_STLIST,TOTITEMS) generated by STASH_PROC and holding a single processed STASH
request in each row. All requests which are either active or require workspace are held; there are TOTITEMS
of them. Some are duplicated if they are to be processed in more than one way. The 33 entries in each row of
STlist, grouped by function, are:
• Item number — Section number
• Processing code
• Frequency — Start timestep — End timestep — Period
• Gridpoint code — Weighting code — Input bottom level — Input top level
• Northern row — Southern row — Western column — Eastern column
• Input code — Input length
• Output code — Output length in D1 — Output address in D1 — Output first level — Output last level
• Position of PP header for field in D1 — Pointer for time series
• STASHlist tag
• Input pseudo level list pointer — Output pseudo level list pointer
• Internal model identifier
• Position of item in D1 array for relevant sub model
• Offset for sampling
• Output address in dump — Output length in dump — Output length of a single level on dump
A description of the above entries now follows:
ENTRIES 1 & 2 STASH item (1–999) & section numbers (0–99),
but, in practice, ppxref sections (0–50).
ENTRY 3 Processing code.
Specifies how the diagnostic is to be processed.
0 Not required by STASH, but space required.
1 Replace
2 Accumulate
3 Time mean
4 Append time-series
5 Maximum
6 Minimum
7 Trajectories
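The arithmetic behind several of these processing codes can be sketched as follows. This is an illustration only: STASH applies the processing incrementally from timestep to timestep using storage in D1, whereas the hypothetical function below simply combines a complete list of samples. Codes 0, 4 and 7 are not covered.

```python
REPLACE, ACCUMULATE, TIME_MEAN, MAXIMUM, MINIMUM = 1, 2, 3, 5, 6


def process(code, samples):
    """Combine equally sized fields (lists of floats) according to a processing code."""
    if code not in (REPLACE, ACCUMULATE, TIME_MEAN, MAXIMUM, MINIMUM):
        raise ValueError("processing code not covered by this sketch")
    if not samples:
        raise ValueError("at least one sample is needed")
    result = list(samples[0])
    for field in samples[1:]:
        for i, value in enumerate(field):
            if code == REPLACE:
                result[i] = value                  # keep only the latest value
            elif code in (ACCUMULATE, TIME_MEAN):
                result[i] += value                 # running sum
            elif code == MAXIMUM:
                result[i] = max(result[i], value)
            else:                                  # MINIMUM
                result[i] = min(result[i], value)
    if code == TIME_MEAN:
        result = [total / len(samples) for total in result]
    return result


samples = [[1.0, 4.0], [3.0, 2.0]]  # two samples of a two-point field
print(process(ACCUMULATE, samples))  # [4.0, 6.0]
print(process(TIME_MEAN, samples))   # [2.0, 3.0]
print(process(MAXIMUM, samples))     # [3.0, 4.0]
```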
ENTRY 4 Frequency (input and output).
Specifies the frequency at which processing occurs, in timesteps. The Rose GUI allows this to be entered
in a variety of units (hours, dump periods, etc.), and STASH_PROC converts it to timesteps. Note that
a diagnostic can only be processed at those timesteps at which its model routines are called, and that
processing will occur only when the calling and processing timesteps coincide. E.g. if the LW radiation
PCR is called every 12 timesteps and the processing frequency is set to 8 timesteps (both starting at 0
timesteps), then no processing will occur until 24 timesteps. The Rose GUI should ensure that processing
of diagnostics from a section can only be requested at those timesteps at which that section is called.
A value of -n in entry 4 indicates that processing is required at the timesteps specified in the STASH output
times table STASHtimes(time,n). Entries 5–7 are then ignored.
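The coincidence rule in the example above (a PCR called every 12 timesteps, processing every 8) is a least common multiple calculation. The sketch below is a plain arithmetic check, assuming both cycles start at timestep 0 and counting from timestep 1; the function names are hypothetical.

```python
import math


def first_processing_timestep(call_every, process_every):
    """First timestep after the start at which calling and processing coincide."""
    return call_every * process_every // math.gcd(call_every, process_every)


def processing_timesteps(call_every, process_every, n_steps):
    """All timesteps in 1..n_steps at which both cycles coincide."""
    return [t for t in range(1, n_steps + 1)
            if t % call_every == 0 and t % process_every == 0]


print(first_processing_timestep(12, 8))  # 24
print(processing_timesteps(12, 8, 72))   # [24, 48, 72]
```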
If Entry 10 = 100 then ENTRY 11 contains vertical level code LBVC.
If Entry 10 = M then ENTRY 11 indicates last input model level.
ENTRIES 12–15 N/S rows, W/E columns.
Define the horizontal domain over which the diagnostic is processed. The GUI offers a number of preset
area options (e.g. 30N to 90N) and also allows the area to be specified in degrees lat/long or row/column
numbers. For a rotated equatorial lat/long grid (usually used for limited area models) the lat/long refers to
the transformed (rotated) coordinate system. If the lat/long specified does not coincide with model points
then the smallest area enclosing that specified will be computed.
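The "smallest enclosing area" rule can be illustrated in one dimension. The sketch below is not the UM code: it assumes grid coordinates stored in ascending order and returns 1-based indices, whereas the real row and column ordering depends on the grid. The function name and the coordinate values are hypothetical.

```python
import bisect


def enclosing_range(coords, lo, hi):
    """Return 1-based (first, last) indices of the smallest run of points enclosing [lo, hi]."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    first = bisect.bisect_right(coords, lo) - 1  # last point at or below lo
    last = bisect.bisect_left(coords, hi)        # first point at or above hi
    first = max(first, 0)                        # clamp to the grid edges
    last = min(last, len(coords) - 1)
    return first + 1, last + 1


grid = [0.0, 2.5, 5.0, 7.5, 10.0]
print(enclosing_range(grid, 3.0, 6.0))  # (2, 4): points 2.5 to 7.5 enclose the request
print(enclosing_range(grid, 2.5, 7.5))  # (2, 4): request coincides with model points
```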
ENTRY 16 Input code.
Specifies the source of the input to STASH.
0 Use primary field at D1(SI(item,section,model))
1 Use space at index SI(item,section,model)
-j Use diagnostic at D1(LIST_S(20,j))
The STASHmaster item number within each section references output diagnostics in the PCR CALL argument.
If the diagnostic is a primary field then D1(SI(item,section,model)) will be passed across in the PCR
CALL. If the diagnostic is transitory and calculated within workspace then STASHWORK(SI(item,section,model))
will be passed across. SI(i,s,m) will generally point to workspace. If no workspace is required at all then
SI(i,s,m) will point to STASHWORK(1) to avoid allocating workspace that is not used. Option 0 allows a
primary variable to be processed by STASH. Option -j allows a diagnostic that has already been processed
into D1(LIST_S(20,j)) to be reprocessed by STASH.
ENTRY 17 Input length.
The length of the diagnostic before STASH processing. This is computed by STASH_PROC.
ENTRY 18 Output code.
Defines the output destination for the diagnostic. Has the following values:
1. Dump store
2. Secondary dump store
-nn. Output to Fortran unit nn
ENTRY 19 Output length.
The length of the diagnostic after STASH processing. This is computed by STASH_PROC. Output length
will differ from input length if (for example) the levels on which output is requested are fewer than the levels
on which the diagnostic is input. For MPP jobs, it will be the length of the diagnostic for that PE.
ENTRY 20 Output address.
Start location of the diagnostic field in D1 after STASH processing. Used by STASH to locate diagnostics in
the dump which require reprocessing. In the MPP context, it is the start location in the D1 array containing
the data for that PE.
ENTRY 21 Output first level.
Specifies type of level selection required. Has the following values:
-N Points to levels list N
M Output on a range of model levels starting at level M
ENTRY 22 Output last level.
Definition depends on entry 21. If Entry 21 negative then:
ENTRY 22 is
1. Model levels
2. Pressure levels
3. Heights
4. Theta levels
5. PV levels (potential vorticity)
STTABL(NSTTIMS,NSTTABL)
The tables of processing times (input & output) for diagnostics which have a negative value in entry 4 of the
STASHlist record.
in that list. The array is of type integer, suitable for model level numbers. When the levels are real values, e.g.
pressure (hPa) or theta (K) levels, they are multiplied by 1000, so allowing for an accuracy of 3 decimal places.
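The scaling by 1000 can be illustrated with a short sketch. The function names are hypothetical, and rounding to the nearest integer is an assumption made here for the illustration; the text above states only that real values are multiplied by 1000.

```python
def encode_level(value):
    """Store a real-valued level as an integer with 3 decimal places retained."""
    return int(round(value * 1000))  # rounding mode is an assumption of this sketch


def decode_level(stored):
    """Recover the real-valued level from its stored integer form."""
    return stored / 1000.0


print(encode_level(850.0))    # 850000  (850 hPa)
print(encode_level(287.125))  # 287125  (287.125 K)
print(decode_level(287125))   # 287.125
```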
SF(0:NITEMS,0:NSECTS)
STASHflags to determine whether this diagnostic is switched on or off this timestep. This is re-evaluated every
timestep in the subroutine SETTSCTL. If SF(0,section)=.false. then there are no STASH requests this timestep
for this section. This will include all STASH requests including output of time processed requests. If a field
is calculated on different timesteps (e.g. radiation timesteps) the time processed field will not coincide with a
timestep where the instantaneous value is valid. Please see SF_CALC below.
SF_CALC(0:NITEMS,0:NSECTS)
STASHflags to determine whether this diagnostic is switched on or off this timestep with any time processing
requests removed. Time processing always has two STASH requests, one being the extraction of instantaneous
value and the other being an output from the time processing field. These do not always coincide so if you need
to know whether a diagnostic needs calculating then this array should probably be used. This is currently used in
radiation but should probably be used in all science sections which need to know whether to output a calculation.
This is re-evaluated every timestep in the subroutine SETTSCTL. If SF_CALC(0,section)=.false. then there are
no STASH requests this timestep for this section.
The n_internal_model dimension has not been removed from these arrays, but is hard-wired to 1 as only the first
dimension, for atmosphere, is now valid.
There is one huge (1.1 MB) STASHmaster file for the UM atmosphere model. STASHmaster_A consists of a
series of 5-line records, one for each primary or diagnostic field, ordered by section number and item number.
Each record gives a complete definition of the characteristics of the field to which it refers. The STASHmaster_A
file is used by the Rose GUI, reconfiguration, and the UM itself. Each released version of the UM has its own
STASHmaster file and users may modify the STASHmaster_A file within a branch.
A single STASHmaster_A contains both New Dynamics and ENDGame prognostics.
The template for each STASHmaster record is as follows:
#|Model |Sectn | Item |Name |
#|Space |Point | Time | Grid |LevelT|LevelF|LevelL|PseudT|PseudF|PseudL|LevCom|
#| Option Codes | Version Mask | Halo |
#|DataT |DumpP | PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PCA |
#|Rotate| PPFC | USER | LBVC | BLEV | TLEV |RBLEVV| CFLL | CFFF |
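As a rough illustration of how the first line of a record laid out like this template could be read, the sketch below splits a line on the “|” separators. It assumes that data lines carry the line number (1 to 5) in place of the “#” shown in the template. The sample line is constructed here for the illustration and is not copied from a real STASHmaster_A file; the function name is hypothetical.

```python
def parse_record_line1(line):
    """Split the first line of a record into model, section, item and name."""
    parts = line.split("|")
    # parts[0] is the line number within the record (assumed); the final
    # element is empty because the line ends with a separator.
    model, section, item = (int(p) for p in parts[1:4])
    name = parts[4].rstrip()
    return {"model": model, "section": section, "item": item, "name": name}


# Hypothetical sample line with a 36-character name field
sample = "1|{:5d} |{:5d} |{:5d} |{:<36}|".format(1, 0, 4, "THETA AFTER TIMESTEP")
record = parse_record_line1(sample)
print(record["section"], record["item"], record["name"])  # 0 4 THETA AFTER TIMESTEP
```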
Model, Section, Item identify the record.
Model = 1, atmosphere; (2, ocean; 3, slab ocean; 4, wave; disused.)
Section numbers are in the range 0–50. Section 0 is for primary fields, section 33 for free tracer prognostics
and section 34 for UKCA tracer prognostics.
Item numbers are in the range 1–999.
Name is a 36-character description of the STASH item. SI units are assumed; otherwise units should be
specified as part of the name.
Space The space code. Specifies the space requirements of a STASH item. If a diagnostic is calculated only
when required by STASH (i.e. under a STASH flag), or is only copied to STASHwork under a STASH flag,
then the space code is 0. The possible values of space code are:
0 Diagnostic field for which space is required only when the diagnostic is requested.
2 Section 0, 33 or 34 items only: primary field available to STASH.
3 Section 0, 33 or 34 items only: primary field unavailable to STASH which is addressed in the dump
and D1. This is the case for fields which are not full horizontal fields, especially those compressed
onto land points only.
7 Non-primary field which points back to a section 0 field.
9 Extra items at the end of D1 for internal fields (i.e. fields required through the timestep, derived from
dump fields), including LBC input fields.
10 Field not held in D1 or the dump, including LBC output fields.
Point Section zero point-back. This is used for non-section 0 items with space code 7. Any such field is a copy
of a primary field. The value of “point” is the section 0 item number of which it is a copy.
Time Time availability code. Specifies at which timesteps the diagnostic or prognostic is available. The following
diagnostic time codes can be selected:
1 Every timestep.
2 Every long wave radiation timestep.
3 Every short wave radiation timestep.
4 Every coupling period.
13 Every convection timestep.
14 Every leaf phenology timestep.
15 Every vegetation competition timestep.
16 Every river-routing timestep.
pseudo levels will be output with a pseudo level code of 0. This indicates they are aggregated properties.
6 Item included only for dust aerosol climatology (stash codes 362–367).
7 Item included only for organic carbon fossil fuel climatology (stash 368,369,370).
8 Item included only for delta aerosol climatology (stash code 371).
n5 Orographic roughness or slope indicator.
0 Item not dependent on orographic roughness.
1 Item included when orographic roughness used.
2 Item included when orographic gradients used in GWD.
3 Item included when orographic slope correction used in SW radiation.
4 Item included when using unfiltered orography for horizon angles.
n6 Interface indicator.
0 Item not dependent on interface.
1 Item included only in limited area models.
2 Item included only for models with lower boundary.
n7 Coupling indicator.
0 Item not dependent on coupling.
1 Item included only if OASIS coupling is used.
2 Item currently unconditionally excluded.
3 Item currently included if OASIS coupling is used, with an iceberg calving ancillary.
7 Coupled run with DMS ocean flux, but always excluded. (Redundant)
n8 Extra fields indicator.
0 Item not dependent on extra fields.
1 Item included only for SST anomaly runs.
2 Item included only for decoupled screen temperature.
3 Item included only for deep convective gusts scheme.
4 Item included only for total aerosol runs.
5 Item included only for total aerosol emission runs.
6 Item included only if snow albedo scheme used.
7 Item included only if tke closure scheme used.
8 Item included only if Energy Adjustment Scheme (section 14) used.
9 Item included only if Thunderstorm Electrification Scheme (section 21) used.
n10n9 “Classic” aerosol indicator. (Items 2, 3, 4, 6 and 9 are not available if UKCA is using the new emissions
scheme, l_ukca_new_emiss=.TRUE.)
0 Item not dependent on any “Classic” aerosol scheme.
1 Item included only for SO2 sulphur cycle.
2 Item included only for SO2 with surface emissions.
3 Item included only for SO2 with high level emissions.
4 Item included only for SO2 with natural emissions.
5 Item included only for SO2 with DMS cycle.
6 Item included only for SO2 with DMS cycle and emissions.
7 Item included for SO2 with O3 oxidation included.
n30 The option codes for UKCA prognostics and diagnostics are used slightly differently from other
sections, to deal with the highly complex requirements of the UKCA chemistry and aerosol scheme and
to allow for future expansion of the code without needing a major redesign. The logic for the UKCA
option codes sits in its own subroutine, tstmsk_ukca, which is located in the UKCA directory.
If the option code is all zeros, the item is always available (to preserve compatibility with other
sections). If the option code is non-zero and UKCA is not on, the item is never available.
If UKCA is on, the code first tests the value of n30 to establish whether the item depends on the
chemistry scheme or the aerosol configuration.
n30=0 The availability of this item depends on the chemistry scheme in use. The code then tests the
value of a specific option code depending on the chemistry. This is the list of which option codes are
tested for which chemistry schemes:
• n1 = Only used by Age-of-air tracer, active if n1=1 and L_UKCA_AGEAIR = .TRUE.
• n2 = BE Tropospheric
• n3 = BE RAQ
• n4 = NR TropIsop
• n5 = NR StratTrop
• n6 = NR Strat
• n7 = NR Offline oxidants
• n8 = BE Offline oxidants
If the checked option code is zero, then the item is not available. If it is 1 then it is available. If it is 2
then it is only available when using the extension to chemistry for aerosol modelling. Other
values can be introduced depending on user needs; for example (since vn9.1), if n3=3 then
the item is only available when the new emission system is turned on (i.e. if l_ukca_new_emiss
is TRUE).
n30=1 The availability of this item depends on the set-up of the GLOMAP-mode aerosol scheme in use.
If GLOMAP-mode is off the item is not available. The code then tests the value of a specific option
code depending on the value of i_mode_setup. If the checked option code is zero, then the item is not
available. If it is 1 then it is available.
• n1 = i_mode_setup = 1
• n2 = i_mode_setup = 2
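The UKCA availability rules above can be restated as a short Python sketch. This is only an illustration of the logic as described in this section, not the tstmsk_ukca code: the function and argument names are hypothetical, the option code is assumed to be a 30-digit string with n1 as the rightmost digit, the value 3 is honoured only for n3 (the one case the text mentions), and the Age-of-air rule for n1 is left out.

```python
# Digit position tested for each chemistry scheme when n30 = 0 (from the list above).
CHEM_SCHEME_DIGIT = {
    "BE Tropospheric": 2,
    "BE RAQ": 3,
    "NR TropIsop": 4,
    "NR StratTrop": 5,
    "NR Strat": 6,
    "NR Offline oxidants": 7,
    "BE Offline oxidants": 8,
}


def digit(option_code, n):
    """Return digit n of the option code, assuming n1 is the rightmost digit."""
    return int(option_code[-n])


def ukca_item_available(option_code, ukca_on, chem_scheme=None,
                        aerosol_chemistry=False, new_emissions=False,
                        glomap_on=False, i_mode_setup=None):
    """Hypothetical restatement of the availability rules; not the UM code."""
    if len(option_code) != 30 or not option_code.isdigit():
        raise ValueError("option code must be a string of 30 digits")
    if set(option_code) == {"0"}:
        return True                      # all zeros: always available
    if not ukca_on:
        return False                     # non-zero code but UKCA is off
    n30 = digit(option_code, 30)
    if n30 == 0:                         # depends on the chemistry scheme
        position = CHEM_SCHEME_DIGIT.get(chem_scheme)
        if position is None:
            return False
        value = digit(option_code, position)
        if value == 1:
            return True
        if value == 2:
            return aerosol_chemistry     # needs the aerosol chemistry extension
        if value == 3 and position == 3:
            return new_emissions         # n3=3: needs the new emission system
        return False
    if n30 == 1:                         # depends on the GLOMAP-mode set-up
        if not glomap_on or i_mode_setup not in (1, 2):
            return False
        return digit(option_code, i_mode_setup) == 1
    return False
```

For example, an item whose only non-zero digit is n5=1 would be reported as available for the NR StratTrop scheme and unavailable for NR Strat.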
Atmosphere section 36 - Free tracer lateral boundary updating.
n3n2n1 - The tracer number (range 1–150).
n6 Interface indicator for boundary values and tendencies.
1 Item included only in limited area models.
Atmosphere section 37 - UKCA tracer lateral boundary updating.
n3n2n1 - The chemical species number (range 1–150).
n6 Interface indicator for boundary values and tendencies.
1 Item included only in limited area models.
Version Mask. A 20-digit binary code. Each STASH section can have up to 20 versions, and each version uses
some subset of the item numbers in that section. Version 0 of a section is the null version, i.e. that section
is not activated, so none of the STASH items in that section would be available to the run. The UM inputs
define which versions of each section are available. Some sections use i_<section>_vn runtime variables
to select the various versions; others have only a single version available. The version mask digits are
numbered from right to left; a 1 in position N implies that this item is available to version N of that section.
A 0 implies that it is not. E.g. an item available only to section version 6A (or 6B or 6C, etc.) would have a
version mask of
00000000000000100000
A single UM routine, src/control/top_level/h_vers_mod.F90, maintains the links between the runtime code
and the version mask in the STASHmaster, and can be consulted to check whether a diagnostic is actually
available to a certain code version.
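The right-to-left numbering of the version mask is easy to get wrong, so a minimal Python sketch of the test may help. The function names are hypothetical and this is not how h_vers_mod.F90 is written; it only restates the rule given above.

```python
import string


def item_in_version(version_mask, version):
    """Return True if the item is available to the given section version.

    Digits are numbered from right to left, starting at 1. Version 0 is the
    null version (section not activated), so no item is available to it.
    """
    if len(version_mask) != 20 or set(version_mask) - {"0", "1"}:
        raise ValueError("version mask must be 20 binary digits")
    if not 0 <= version <= 20:
        raise ValueError("version must be in the range 0-20")
    if version == 0:
        return False
    return version_mask[-version] == "1"


def version_number(label):
    """Numeric part of a section version label such as '6A' (hypothetical helper)."""
    return int(label.rstrip(string.ascii_uppercase))
```

With the mask from the example above (a single 1 in position 6), `item_in_version(mask, version_number("6A"))` is true and the same call for version 1 is false.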
Halo Halo code. Defines size of halo.
1. Single point halo.
2. Extended halo.
3. No Halo.
DataT Data type code: 1 Real, 2 Integer, 3 Logical.
DumpP PP LBPACK code for data in the dump. See UMDP F03.
PC1-A Packing accuracy codes. These codes are used to control packing of data within fieldsfiles at output.
Values for each of the output streams are set in the GUI via the NLSTCALL_PP namelists (one per output
stream). The types of packing currently allowed are:-
PC1 WGDOS packing profile 1 — Operational output
PC2 WGDOS packing profile 2 — Standard Climate output (OLD)
PC4 WGDOS packing profile 4 — Stratosphere model output
PC5 WGDOS packing profile 5 — Standard Climate packing
PC6 GRIB packing (takes a positive value, unlike the WGDOS packing profiles above)
PCn values are interpreted as follows:-
-99 outputs the fields in an unpacked state, irrespective of the chosen packing profile.
For GRIB packing, the value indicates the number of bits used.
WGDOS fields are packed to an absolute precision of 2**PackCode. The number is usually negative, so
a packing code of -1 gives an accuracy of 0.5 (1/2), -3 an accuracy of 0.125 (1/8) etc.
The packed number is stored in the format: data = offset + scale factor * packed value
The algorithm packs each model row separately. On each model row the offset is set to the minimum
value in that row. Each model row uses a different offset. The scale factor is expressed as a power of
2. E.g. in a temperature row with a maximum of 300K, a minimum of 285K and a packing code of -1, the
offset would be 285 and the scale factor would be 0.5. Therefore, the value 290.2 would be stored as: 285
+ 0.5 * 10 and when decoded the precision constraint means the value would become 290.0.
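The row-by-row quantisation described above can be sketched in a few lines of Python. This is only an illustration of the offset and scale-factor arithmetic: the function names are hypothetical, rounding to the nearest integer is an assumption, and the real WGDOS algorithm also handles bit widths, missing data and the on-disk layout, none of which is modelled here.

```python
def pack_row(values, packing_code):
    """Quantise one model row: data = offset + scale * packed value."""
    scale = 2.0 ** packing_code          # absolute precision, e.g. -1 gives 0.5
    offset = min(values)                 # each row uses its own minimum as offset
    packed = [round((v - offset) / scale) for v in values]
    return offset, scale, packed


def unpack_row(offset, scale, packed):
    """Recover the (precision-limited) values from the packed integers."""
    return [offset + scale * p for p in packed]


# The temperature row from the text: minimum 285 K, maximum 300 K, code -1.
offset, scale, packed = pack_row([285.0, 290.2, 300.0], -1)
restored = unpack_row(offset, scale, packed)
```

Here 290.2 is stored as the integer 10 and decodes to 290.0, matching the worked example.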
In addition to the precision, there is a second constraint when choosing a packing code. The total size
of the number used to represent the packed value cannot be more than 32 bits. In practical terms this
means that if you choose a packing code which is too accurate the packing will fail. E.g. to pack 1.0E6
with a packing code of -13 (absolute accuracy of 2**-13=0.00012207) you need 33 bits. This is too high
an accuracy and WGDOS packing will fail. The absolute accuracy you want determines an upper limit on
the packing code and the biggest value you need to pack (after subtracting the offset) provides a lower
limit on the packing code. To check your packing code is safe you need to estimate the maximum range
you are likely to have on a model row. The packing accuracies are ignored if the whole output stream is
set to “not packed”.
Further examples showing how to choose appropriate packing codes are shown in appendix F.
Rotate Rotation code. This is for wind diagnostics. It indicates whether STASH receives the wind components
relative to the rotated model grid or the regular lat/long grid (for ELF models).
0 Data are either non-vectorial or are relative to the model’s grid.
1 Data are passed to STASH relative to the lat/long grid.
PPFC PP field code. This is the field code defining the data to (older) PP graphics applications and is only used
to specify the PP header item (23=lbfc) on output. A list of currently reserved PP field codes is maintained
at the Met Office; if further details are required please contact the UM system team. Example lbfc codes:
0 unspecified for PP applications
1 height
8 pressure
16 temperature
USER Not used.
LBVC PP vertical coordinate type. Defined and documented in UMDP F03, this is the field code defining the
vertical coordinate of an output field to (older) PP graphics applications, i.e. such that there is a common
value for each horizontal field. LBVC is used to specify the PP header item (26=lbvc) on output, but is also
required within STASH to generate other PP header output items, such as (52=blev) in the correct format.
Example lbvc codes:
0 level unspecified
1 height
8 pressure
65 hybrid height (as atmosphere model levels)
128 mean sea level
129 surface level
130 tropopause level
131 maximum wind level
132 freezing level
BLEV,TLEV Not used - set to 0. [Originally defined as base/top level reference codes, but the appropriate PP
output header items are set dynamically in STASH.]
RBLEVV Not used - set to 0. [Originally defined as ref lbvc=LBRVC, but the appropriate PP output header
items are set dynamically in STASH.]
CFLL, CFFF Historical, operational level/field codes used on the GPCS. These are field codes defining the
data to the output process applications and are only used within the UM to specify the PP header
items (33=lblev;32=lbtyp) on output. Specifically these are still required for some operational applications
through the FFREAD utility on the GPCS. For non-operational fields set CFLL=CFFF=0. Set CFLL=8888
for no level or special level; =9999 for surface.
Rules for the Level Compression Flag The value of the level compression flag determines whether the
input to STASH from a field is passed in on all available levels and pseudo levels (flag=0) or on a list of
specified levels (flag=1). The only restrictions on the flag value are:
(1) Any item with a non-zero space code must have a levels flag of zero.
(2) Any item with levels type 3,4,7,8, or 9 (non-model levels) must have a levels flag of 1.
(3) If the levels type is 5 (single level) then the flag is zero if the pseudo-type is zero, and 1 if the pseudo-
type is non-zero.
If the levels type is 1, 2 or 10 (model levels) then the flag may be 0 or 1 (0 for an intercepted diagnostic —
see above).
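The rules above lend themselves to a simple consistency check on a new record. The following Python sketch is one possible way to express them; the function name and argument names are hypothetical, and it reports violations rather than trying to resolve cases where two rules disagree.

```python
NON_MODEL_LEVEL_TYPES = {3, 4, 7, 8, 9}


def levcom_violations(space, level_type, pseudo_type, levcom):
    """Return a list of rule violations for a level compression flag."""
    problems = []
    if levcom not in (0, 1):
        problems.append("flag must be 0 or 1")
    if space != 0 and levcom != 0:
        problems.append("rule 1: non-zero space code requires flag 0")
    if level_type in NON_MODEL_LEVEL_TYPES and levcom != 1:
        problems.append("rule 2: non-model levels require flag 1")
    if level_type == 5:
        expected = 0 if pseudo_type == 0 else 1
        if levcom != expected:
            problems.append("rule 3: single level requires flag %d" % expected)
    return problems
```

For instance, a diagnostic with space code 0 on levels type 3 and flag 1 passes, whereas a primary field (space code 2) with flag 1 is reported under rule 1.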
Users may define their own STASHmaster records which can either supplement the records in the system
STASHmaster or overwrite existing STASHmaster records. By this means users can obtain new diagnostic,
prognostic and ancillary fields, or correct existing STASHmaster records.
It is important to choose a new STASHitem number for any new record, so that conversion routines and
post-processing can deal with a new set of STASHmaster records without confusion. For Met Office users,
STASHmaster files for each model version can be found from the UM Homepage under Metadata. There is also a
link to a list of STASH codes reserved for future versions; requests for new STASHitem numbers should be
made to [email protected].
The user must also provide appropriate changesets to the source code to calculate any new field they define in
this way and interface it to the STASH routines.
Care should be taken when making new records to copy the line formatting precisely as a small error (e.g.
in packing code) can cause a large problem which may not become apparent until the STASH routines are
executed and new diagnostics viewed. See C or below for details.
If a new STASH record is a prognostic the user must specify an initialisation choice in the corresponding Rose
metadata and GUI panel.
The format for any STASHmaster record is shown below. The header sections H1, H2, H3, labels and end-of-file
mark are already in STASHmaster_A. Any line starting with a # is ignored by Rose and the UM. Note the following:
(1) Most integers do not have leading zeros, but the option code and version mask are strings of single
digits and must have all their digits (30 and 20 respectively), including leading zeros.
(2) The exact spacing of all the entries is most important.
H1| SUBMODEL_NUMBER=1
H2| SUBMODEL_NAME=ATMOS
H3| UM_VERSION=9.2
#
#|Model |Sectn | Item |Name |
#|Space |Point | Time | Grid |LevelT|LevelF|LevelL|PseudT|PseudF|PseudL|LevCom|
#| Option Codes | Version Mask | halo |
#|DataT |DumpP | PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PCA |
#|Rotate| PPFC | USER | LBVC | BLEV | TLEV |RBLEVV| CFLL | CFFF |
#
#=================================================================
#
1| 1 | 16 | 245 |USER DIAGNOSTIC XXXX |
2| 0 | 0 | 1 | 1 | 3 | 10 | 11 | 0 | 0 | 0 | 1 |
3| 000000000000000000000000000000 | 00000000000000000001 | 0 |
4| 1 | 1 | -99 -99 -99 -99 -99 -99 -99 -99 -99 -99 |
5| 0 | 528 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
#
1| 1 | 16 | 246 |USER DIAGNOSTIC YYYY |
2| 0 | 0 | 1 | 1 | 2 | 10 | 11 | 0 | 0 | 0 | 1 |
3| 000000000000000000000000000000 | 00000000000000000001 | 0 |
4| 1 | 2 | -99 -99 -99 -99 -99 -99 -99 -99 -99 -99 |
5| 0 | 529 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
#
#=================================================================
#
1| 1 | 18 | 1 |USER DIAGNOSTIC ZZZZ |
2| 7 | 1 | 1 | 1 | 5 | -1 | -1 | 0 | 0 | 0 | 0 |
3| 000000000000000000000000000000 | 00000000000000000001 | 0 |
4| 1 | 2 | -3 -3 -3 -3 -3 -3 -99 -99 -99 -99 |
5| 0 | 8 | 0 | 129 | 0 | 0 | 0 | 9999 | 12 |
#
#=================================================================
#
1| -1 | -1 | -1 |END OF FILE MARK |
2| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3| 000000000000000000000000000000 | 00000000000000000000 | 0 |
4| 0 | 0 | -99 -99 -99 -99 -99 -99 -99 -99 -99 -99 |
5| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
#
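To show how the five numbered lines of a record map onto the named entries of the template, here is a minimal Python reader. It is a sketch under stated assumptions, not the parser used by Rose or the UM: it splits on the `|` separators rather than on fixed column positions, and the dictionary keys and function name are hypothetical.

```python
FIELD_NAMES = {
    "1": ["model", "section", "item", "name"],
    "2": ["space", "point", "time", "grid", "level_t", "level_f", "level_l",
          "pseud_t", "pseud_f", "pseud_l", "lev_com"],
    "3": ["option_codes", "version_mask", "halo"],
    "4": ["data_t", "dump_p", "packing_codes"],
    "5": ["rotate", "ppfc", "user", "lbvc", "blev", "tlev", "rblevv",
          "cfll", "cfff"],
}
STRING_FIELDS = {"name", "option_codes", "version_mask"}


def parse_stashmaster(text):
    """Return one dictionary per record, dropping the end-of-file mark."""
    records = []
    current = None
    for line in text.splitlines():
        tag, sep, rest = line.partition("|")
        tag = tag.strip()
        if not sep or tag not in FIELD_NAMES:
            continue                     # comment (#), header (H1-H3) or blank line
        if tag == "1":                   # line 1 starts a new record
            current = {}
            records.append(current)
        if current is None:
            continue
        cells = [cell.strip() for cell in rest.split("|")]
        for name, cell in zip(FIELD_NAMES[tag], cells):
            if name in STRING_FIELDS:
                current[name] = cell     # keep leading zeros intact
            elif name == "packing_codes":
                current[name] = [int(code) for code in cell.split()]
            else:
                current[name] = int(cell)
    return [rec for rec in records if rec.get("model") != -1]


SAMPLE = """\
H1| SUBMODEL_NUMBER=1
#|Model |Sectn | Item |Name |
1| 1 | 18 | 1 |USER DIAGNOSTIC ZZZZ |
2| 7 | 1 | 1 | 1 | 5 | -1 | -1 | 0 | 0 | 0 | 0 |
3| 000000000000000000000000000000 | 00000000000000000001 | 0 |
4| 1 | 2 | -3 -3 -3 -3 -3 -3 -99 -99 -99 -99 |
5| 0 | 8 | 0 | 129 | 0 | 0 | 0 | 9999 | 12 |
#
1| -1 | -1 | -1 |END OF FILE MARK |
"""
records = parse_stashmaster(SAMPLE)
```

Keeping the option code and version mask as strings preserves their leading zeros, which matters for note (1) above.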
The D1_ADDR array holds a list of information about all objects in the D1 array. A single or a multi-level field
counts as one object. Note that when running in MPP mode (now the default), each processing element (PE)
has its own copy of D1 and, therefore, its own D1_ADDR array. The number of objects in D1 for each submodel
is stored in a one-dimensional array called NO_OBJ_D1.
Note that the only submodel in the UM is now the Atmosphere one; coupling to other submodels, e.g.
Ocean/Sea-ice, is implemented using OASIS coupling routines.
Information in D1_ADDR is used primarily for gathering and scattering of fields between PE0 and all the other
PEs. These operations occur mainly in the climate meaning, dumping, and fieldsfile output routines.
D1_ADDR includes pointers to the equivalent objects in the STLIST and LOOKUP arrays, and similarly STLIST
information includes a pointer to D1_ADDR (see the D1_ADDR d1_stlist_no and d1_lookup_ptr elements, and the
STLIST st_position_in_d1 element). Note that D1_ADDR is still dimensioned by submodel=1, whereas STLIST
is not. Therefore, when looping through the STASH list, check that the internal model identities compare. The
following code example shows how the cross-referencing between arrays can be used. It loops through the
STLIST items and selects only those in the atmosphere submodel:
#include "typd1.h"   ! Contains header file d1_addr.h
! Get pointer to the partition of D1_ADDR relating to atmosphere
MODNUM = SUBMODEL_FOR_SM(atmos_im)
DO IE = 1, TOTITEMS   ! Loop over STLIST entries
  ! Get pointer to element in D1_ADDR array
  PTD1 = STLIST(st_d1pos, IE)
  ! Compare internal model ids - ocean items would not match.
  IF (STLIST(s_modl, IE) .eq. D1_ADDR(d1_imodl, PTD1, MODNUM)) THEN
    ! This STASH item is part of the atmosphere submodel
    ! (process the item here)
  END IF
END DO
Accessing information in D1 ADDR
The D1_ADDR array is held within the same super-array as D1 and is carried, with D1, in argd1.h and artd1.h.
The D1_ADDR array has three dimensions:
INTEGER D1_ADDR(D1_LIST_LEN,NO_OBJ_D1_MAX,N_SUBMODEL_PARTITION)
D1_LIST_LEN Number of items of information held about each object.
NO_OBJ_D1_MAX Number of objects in biggest submodel.
N_SUBMODEL_PARTITION Number of submodel partitions (the partition number and submodel ident are both
1 from vn7.0 onwards).
Some of the information from the D1_ADDR array is output as a table “Addressing of D1 array”. The
information held about each object in the d1_addr.h header file is as follows:
1. d1_object_type: can be set to prognostic=0, diagnostic=1, secondary diagnostics=2 and other=3.
“other” means space code 9 items (such as exner theta levels or dual time ocean diagnostics).
2. d1_imodl: internal model identity number, e.g. 1 for Atmosphere.
3. d1_section: section number.
4. d1_item: item number.
5. d1_address: address in D1. For MPP this refers to the addressing of the local D1.
6. d1_length: length of the local record.
7. d1_grid_type: grid type; from STASHmaster file.
8. d1_no_levels: number of levels including pseudolevels.
9. d1_stlist_no: a pointer to the STASHlist array. Set to −1 for prognostics.
10. d1_lookup_ptr: a pointer to the dump header lookup table.
11. d1_north_code: address for the northern row; from STASHlist record.
12. d1_south_code: address for the southern row; from STASHlist record.
13. d1_east_code: address for the eastern column; from STASHlist record.
14. d1_west_code: address for the western column; from STASHlist record.
15. d1_gridpoint_code: from STASHmaster file.
16. d1_proc_no_code: from STASHmaster file.
17. d1_halo_type: from STASHmaster file.
Unused elements of the array are initialised to −1. These unused elements would include STASH related
information for prognostic items (prognostic items have no STASH entries).
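As an aid to reading one column of D1_ADDR, the list above can be mirrored in a small Python structure. This is an illustrative sketch only: it assumes the index of each element equals its position in the numbered list, the class and function names are hypothetical, and the sample values are made up.

```python
from enum import IntEnum


class D1Addr(IntEnum):
    """Element indices, assumed to follow the numbered list above."""
    OBJECT_TYPE = 1
    IMODL = 2
    SECTION = 3
    ITEM = 4
    ADDRESS = 5
    LENGTH = 6
    GRID_TYPE = 7
    NO_LEVELS = 8
    STLIST_NO = 9
    LOOKUP_PTR = 10
    NORTH_CODE = 11
    SOUTH_CODE = 12
    EAST_CODE = 13
    WEST_CODE = 14
    GRIDPOINT_CODE = 15
    PROC_NO_CODE = 16
    HALO_TYPE = 17


OBJECT_TYPES = {0: "prognostic", 1: "diagnostic",
                2: "secondary diagnostic", 3: "other"}


def describe(entry):
    """Turn one 17-element D1_ADDR entry into a dictionary; -1 means unused."""
    if len(entry) != len(D1Addr):
        raise ValueError("expected %d elements" % len(D1Addr))
    info = {}
    for field in D1Addr:
        value = entry[field - 1]         # Fortran indices start at 1
        info[field.name.lower()] = None if value == -1 else value
    info["object_type"] = OBJECT_TYPES[entry[D1Addr.OBJECT_TYPE - 1]]
    return info


# Made-up entry for a prognostic: STASH-related elements are left at -1.
example = describe([0, 1, 0, 4, 1, 1000, 1, 38, -1, 12,
                    -1, -1, -1, -1, -1, -1, 1])
```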
F.1 Background
This chapter makes use of the following notation in the worked examples.
\[ \Delta = I_n \, 2^{S} \tag{1} \]
To determine the number of bits required to pack a field, we can write an expression for the accuracy
\[ A = \frac{\Delta}{I_n} \tag{3} \]
Problem: A data field has a minimum value of -10 and a maximum of 40. Determine the range of packing codes
that can be used to represent this data in a WGDOS packed fields file.
Solution. To determine the most negative packing code (greatest accuracy) we pack with the full 32 bits avail-
able. Therefore, starting from equation 2 we can say
\begin{align}
S &= \frac{\log(\Delta / I_{32})}{\log 2} \tag{6} \\
  &= \frac{1}{\log 2} \log\left( \frac{50}{2^{32-1} - 1} \right) \tag{7} \\
  &= -25.36 \tag{8}
\end{align}
As the packing code must be an integer, and the above expression provides a lower limit, we round to the
nearest larger integer to give a minimum value for S of -25.
To determine the upper bound for a packing code, we require at least one data bit, which corresponds to a 2 bit
signed integer. In this case however all values will be set to either the maximum or minimum value in the field.
\begin{align}
S &= \frac{\log(\Delta / I_{2})}{\log 2} \tag{9} \\
  &= \frac{1}{\log 2} \log\left( \frac{50}{2^{2-1} - 1} \right) \tag{10} \\
  &= 5.6 \tag{11}
\end{align}
As this is an upper bound, we round to the nearest smaller integer to give an upper bound of 5. It is unusual,
however, to see positive packing codes.
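The two bounds in this example can be checked numerically. The Python sketch below assumes, as equations 7 and 10 suggest, that I_n is the largest n-bit signed integer, 2^(n−1) − 1; the function names are hypothetical.

```python
import math


def max_signed(n_bits):
    """Largest value of an n-bit signed integer, I_n = 2**(n-1) - 1."""
    return 2 ** (n_bits - 1) - 1


def packing_code_limits(field_min, field_max, max_bits=32, min_bits=2):
    """Return the (most negative, most positive) usable packing codes."""
    delta = field_max - field_min
    # Greatest accuracy: use all 32 bits, then round up to an integer code.
    lower = math.ceil(math.log2(delta / max_signed(max_bits)))
    # Least accuracy: a single data bit (2-bit signed integer), rounded down.
    upper = math.floor(math.log2(delta / max_signed(min_bits)))
    return lower, upper


limits = packing_code_limits(-10, 40)
```

For the field in this example (minimum −10, maximum 40) the function returns −25 and 5, in agreement with the worked solution.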
F.2.2 Example 2: Determine number of bits required to pack with a given accuracy
Problem: A data field has a range of values from 5 to 25. We wish to pack with an accuracy of $2.4 \times 10^{-7}$ (a
packing code of -22). How many bits are required to pack with this accuracy?
Solution: Starting from equation 5 we can say
Round this value up as the number of bits must be an integer, so we require a 28 bit signed integer to store the
data.
Problem: A data field has a range of values from 10 to 100, and we wish to pack with an accuracy of $5 \times 10^{-10}$
(a packing code of -31). Is this possible?
Solution: As in the above example, we start from equation 5, and substituting the values for range and accuracy
we solve to give
\[ n = 38.4 \tag{15} \]
and rounding up we would require 39 bits to pack this data. This is greater than the 32 bit limit imposed by the
WGDOS algorithm, so this data can not be packed at this accuracy.
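The bit counts in the last two examples can be reproduced with a short Python sketch. It assumes that the bit-count expression has the form n = 1 + log2(Δ/A + 1), which follows from equations 1 and 3 if I_n = 2^(n−1) − 1 as in equations 7 and 10; the function names are hypothetical, and the accuracy is taken as exactly 2 raised to the packing code rather than the rounded decimal values quoted in the problems.

```python
import math

WGDOS_MAX_BITS = 32


def bits_required(field_min, field_max, packing_code):
    """Signed-integer width needed to pack a range at accuracy 2**packing_code."""
    delta = field_max - field_min
    accuracy = 2.0 ** packing_code
    return math.ceil(1 + math.log2(delta / accuracy + 1))


def can_pack(field_min, field_max, packing_code):
    """True if the required width fits within the 32-bit WGDOS limit."""
    return bits_required(field_min, field_max, packing_code) <= WGDOS_MAX_BITS


bits_example_2 = bits_required(5, 25, -22)
bits_example_3 = bits_required(10, 100, -31)
```

This gives 28 bits for the 5 to 25 field at code -22 and 39 bits for the 10 to 100 field at code -31, so only the first can be packed.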