Hydrognomon Theory English 01 02
- New aggregation routines etc., compatible with the new time step implementation of
  time series.
- Application for hydro-measurement registration and calculations.
- Implementation of discharge - sediment discharge curves.
- Enrichment with functions such as general nonlinear time series operations,
  disaggregation, etc.
- Multiple time series regression with various optimization options and monthly
  changing solution search, variance chart plotting, etc.
- Improved statistical operations, potential evapotranspiration calculation, etc.
- Areal rainfall integration, rain level calculation, etc.
- Various bug fixes of previous versions.
This manual presents the methodologies and algorithms used for data management and data
processing, and provides the scientific documentation of the package. It also defines the
data plotting standardization. The manual is mostly based on the corresponding manual of
the ODYSSEY program, with number id=676 at ITIA's web site
(http://www.itia.ntua.gr/el/docinfo/676/). The user's manual, which presents the analysis
of the program procedures and functions, will be available separately from the FAQ list and
the demonstration videos (see http://www.youtube.com/hydrognomon/).
In general, the system's development was based on:
- The ITIA team experience (http://www.itia.ntua.gr/).
- Experience gained from the development of previous versions (Christofides, 1998;
  Christofides and Kozanis, 2004; Kozanis et al., 2005), as well as from the
  development of trial versions (Kozanis et al., 2005).
- The design of the database that contains the time series data, as well as experience
  and knowledge gained from the previous design (Christofides et al., 2005;
  Papakostas, 2004).
- Market research on competing software systems.
- Feedback and malfunction reports (bugs) from users of past versions.
- Bibliographical research in Hydrology and Computer Science.
This system does not implement advanced processes, such as synthetic time series generation
(this function is provided by the Castalia software, which is also available at ITIA's site),
but simple (typical hydrological) processes that play an auxiliary role. The most important
functions provided by the system are the following:
- Time series transformation to one with a regular time step (raw time series are
  usually nearly regular, but irregularities occur often and have to be eliminated
  before further processing).
- Time series integration to a larger time step (aggregation), e.g. ten-minute to
  hourly time step, hourly to daily time step, daily to monthly time step.
- Standard consistency tests such as the homogeneity test, the extreme values test and
  the time consistency test.
- Linear regression between time series, multiple regression, organic correlation and
  autocorrelation.
- Water balances: lumped rainfall - runoff model. (This subsystem is known as Zygos.)
- Estimation of missing values by means of linear regression, with the option to
  introduce a random term in order to maintain the statistical properties. Time series
  expansion.
- Linear operations between time series.
- Stage - discharge curve estimation by means of statistical methods, and expansion of
  the curves using hydraulic equations.
- Export of discharge time series from stage time series, as well as of volume and area
  time series from stage time series of reservoirs and lakes.
- Evapotranspiration and potential evapotranspiration calculation using analytical or
  semi-empirical methods.
- Expansion of evapotranspiration samples.
- Time series sampling, statistical property estimation, statistical parameter adjustment,
  statistical predictions, statistical tests and confidence interval estimation. (This
  subsystem is known as Pythia.)
- Time series analysis of special rainfalls: Intensity - Duration - Frequency (IDF)
  curve estimation by means of consistent methodologies. (This subsystem is known as
  Omvros.)
CAUTION
Hydrognomon is nothing more than a computing tool for time-consuming calculations; it
can never substitute for experienced experts and their critical thinking. In many cases,
Hydrognomon users have reached false results and conclusions because the data, the applied
methodology and the underlying assumptions were not taken seriously into account. It should
be emphasized that the software package is quite user friendly, and this can be a dangerous
trap. Users who are not thoroughly familiar with it should always consult this documentation
and double check the meaning of each term used, such as time series, parameters, etc.
Figure 1.1: Typical software screen
2 Time series data standardization
2.1 Introduction
Time series data standardization is implemented in order to cover every possible need, such
as raw data measurement and storage, export of processed data and climatic characteristics,
generation of synthetic time series, water balance formation, etc. The most important
standardization characteristic is the time series time step, which is also called the
observation time scale. Other important standardization characteristics presented in this
documentation are the parameter type and special time step characteristics, such as the
timestamp start time and various offsets. Finally, there is a flag option that can be used
to mark special time series values.
The main time series data types, according to their origin (raw, processed and synthetic), are
presented in Table 2.1. In previous versions of the software packages, raw data management
was handled by Hydrognomon, and processed data management was mainly handled by Hydrognomon
as well. Synthetic data management, on the other hand, was handled by the Castalia
application. Synthetic data were used in management scenarios of the Hydronomeas¹ application
and could comprise multiple predictions for the same time series. The current Hydrognomon
version cannot manage synthetic data with multiple predictions. However, data can be
transferred from the Castalia application to Hydrognomon using the copy - paste command.
Raw data may contain errors and need reductions, additions, etc. Raw data should be stored in
separate files or databases and protected, so that they are always available for checking or
re-processing. Processed data, which are based on raw data, may have the same or a derived
time scale.
The time series stored by the new Hydrognomon version (version 4) are called version 2 time
series. The standard used is defined by openmeteo.org and can be found at:
http://openmeteo.org/doc/timeseries.html#timeseries-objects (see also the appendix at the end
of this documentation).
¹ The software packages Hydrognomon, Hydronomeas, Hydrogaea and Castalia are projects
of the ITIA research team of NTUA (http://itia.ntua.gr). Hydronomeas is a hydrosystem
management software package. Castalia is used for synthetic time series generation. The
resulting synthetic time series are used by Hydronomeas or Hydrogaea scenarios. Finally,
Hydrogaea is a hydrological and hydrogeological simulation model.
Table 2.1: Time series data types

Data type    Description
Raw          Intact (not processed in any way) measurements, taken using
             conventional methods or by fully automated telemetry. Raw data
             are subjected to consistency control. Before any other process,
             they are also reduced to constant time step time series.
Processed    Results of raw data processing, e.g. aggregated time series,
             mean values, time series subjected to infilling of missing values
             and homogeneity restoration, etc., or time series generated by
             complex procedures such as evapotranspiration estimation models
             or water balances.
Synthetic    Prediction time series (i.e. values of quantities for periods
             with no measurements, produced e.g. by the Castalia application)
             or composite time series, e.g. those derived from the integration
             of samples from various stations into a single sample.
In the case of processed variables, or of measured values that accumulate over a period of
time (e.g. the daily measurement of a pluviometer that accumulated the rain of a whole day),
the time stamp does not correspond to a certain instant but to an integration time period.
In the previous example, the pluviometer's time stamp corresponds to a time period of
24 hours or 1440 minutes (the integration time period is usually as long as the observation
time scale). For monthly values (mean maximum temperature of a month, or total monthly
rainfall), timestamps correspond to monthly periods. The timestamps that refer to values of
annual runoff correspond to whole years, etc.
It is assumed that timestamps mark the end of the measuring / integration period for
accumulated variables of time scales up to a day. The physical meaning of this assumption is
shown by an example: if a rainfall value is recorded together with a date and hour, the
value refers to the period that precedes that date and hour.
For variables of monthly time scale (or longer), the timestamp is shifted by introducing a
suitable offset variable (called actual offset) of the same length as the time scale, so
that it finally marks the beginning of the interval. For example, the total annual rainfall
of the year 2009 is recorded at 1-1-2009, not at 1-1-2010, in order to avoid
misunderstandings. In this case, the measurement refers to a whole year of observation (from
1-1-2009 to 31-12-2009). Actual offsets can also be used in daily or even shorter time
scales, depending on the user's judgment.
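The timestamp convention described above can be illustrated with a minimal Python sketch. It is not Hydrognomon code: the function names `add_months` and `covered_interval` and their parameters are hypothetical, and the sketch assumes that timestamps of monthly or longer steps fall on the first day of a month.

```python
from datetime import datetime, timedelta


def add_months(ts, months):
    """Shift ts forward by whole months (assumes the day exists in the target month)."""
    index = ts.year * 12 + (ts.month - 1) + months
    return ts.replace(year=index // 12, month=index % 12 + 1)


def covered_interval(ts, step_minutes=0, step_months=0):
    """Return (start, end) of the period an accumulated value refers to.

    Steps up to a day: the timestamp marks the END of the period.
    Monthly or longer steps: an actual offset equal to the step is assumed,
    so the timestamp marks the BEGINNING of the period.
    """
    if step_months:
        return ts, add_months(ts, step_months)
    return ts - timedelta(minutes=step_minutes), ts


# Daily rainfall stamped 1973-11-18T08:00 refers to the preceding 24 hours.
daily = covered_interval(datetime(1973, 11, 18, 8, 0), step_minutes=1440)
# Annual rainfall stamped 2009-01-01 refers to the year that follows.
annual = covered_interval(datetime(2009, 1, 1), step_months=12)
```

Here `daily` runs from 1973-11-17T08:00 to 1973-11-18T08:00, while `annual` runs from 2009-01-01 to 2010-01-01 (end exclusive).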
Time stamps presentation
The international standard ISO 8601 is used for the presentation of time stamps
(http://en.wikipedia.org/wiki/ISO_8601). According to this standard, the presentation runs
from the largest time unit to the smallest, e.g. from years to minutes. For example, a
complete time stamp has the following form: 2010-06-18T14:40 (a space can be used instead
of T). For a monthly time scale, only the year and the month are used (e.g. 2010/06). On an
annual basis, only the year is used (e.g. 2010) or the hydrological year (e.g. 2009-2010).
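As an illustration of these presentation rules, the following Python sketch formats a timestamp according to the time scale. The function `format_timestamp` and its `step` labels are hypothetical names introduced here; the output patterns simply reproduce the examples given in the text.

```python
from datetime import datetime


def format_timestamp(ts, step="full"):
    """Format ts from the largest time unit to the smallest.

    step: "full", "monthly", "annual" or "hydrological" (hypothetical labels).
    For "hydrological", ts is assumed to be the first day of the hydrological year.
    """
    if step == "monthly":
        return ts.strftime("%Y/%m")
    if step == "annual":
        return ts.strftime("%Y")
    if step == "hydrological":
        return f"{ts.year}-{ts.year + 1}"
    return ts.strftime("%Y-%m-%dT%H:%M")


full = format_timestamp(datetime(2010, 6, 18, 14, 40))
monthly = format_timestamp(datetime(2010, 6, 1), "monthly")
hydro = format_timestamp(datetime(2009, 10, 1), "hydrological")
```

Because the units run from largest to smallest, sorting the formatted strings alphabetically also sorts them chronologically.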
Table 2.2: Standard time series time steps, possible use

Term            Multiplier /      Use
                time basis
Unknown,        -                 No time step.
Unspecified
Five-minute     5 x minutes       Raw data from telemetric stations or from
Ten-minute      10 x minutes      scanned films of recording equipment.
Hourly          60 x minutes      Raw data from recording devices, or data
                                  aggregated from shorter steps, e.g. 10
                                  minutes. Used for detailed models, e.g.
                                  flood routing.
Daily           1440 x minutes    Raw data usually recorded by means of
                                  conventional methods (e.g. pluviometers),
                                  or processed or even synthetic time series.
                                  Used for detailed water balances,
                                  evapotranspiration models, etc.
Monthly         1 x month         Processed or synthetic time series. Standard
                                  time step for water balance plots, water
                                  needs estimation, water resources management
                                  in general and climatic properties
                                  estimation.
Yearly,         12 x month        Processed or synthetic time series. Used for
Annual                            water resources management, statistical time
                                  series processing, investigation of climatic
                                  trends (such as hydrological persistence,
                                  the Hurst phenomenon).
Variable        -                 Raw, sporadic measurements or results of
                                  some processes (see discharge export from
                                  level measurements, unit 5.4).
Data resulting from seasonal aggregation can be represented using annual time series (e.g.
October - March rainfall). Monthly and annual time steps are not constant, because the
duration of the month or the year varies. For time series of annual time step, the timespan
is 365 days, or 366 days if the year is a leap year. Similarly, for monthly time series, the
time interval can be 28, 29, 30 or 31 days, depending on the month and the year (for leap
years). Processes that generate aggregated time series take this variation into account.
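The varying length of monthly and annual steps can be computed with the Python standard library, as in the following sketch. The helper names `days_in_month` and `days_in_year_from` are hypothetical; the second one also covers a year that does not start in January, such as a hydrological year.

```python
import calendar
from datetime import date


def days_in_month(year, month):
    """Number of days in the given month: 28, 29, 30 or 31."""
    return calendar.monthrange(year, month)[1]


def days_in_year_from(start):
    """Days in the 12-month period that begins at `start` (first day of a month)."""
    end = date(start.year + 1, start.month, 1)
    return (end - start).days


february_2012 = days_in_month(2012, 2)                 # leap year
calendar_2009 = days_in_year_from(date(2009, 1, 1))    # calendar year 2009
hydro_2011_12 = days_in_year_from(date(2011, 10, 1))   # contains 29 Feb 2012
```

A process that converts, say, a monthly total to a mean daily value would divide by `days_in_month(year, month)` rather than by a fixed 30.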
values from (0,0) to (59,0), e.g. 00:00, 01:00, 02:00 or 00:59, 01:59, etc. Daily time series
with Nominal Offset = (480,0) refer to time stamps 2010-06-18T08:00, 2010-06-19T08:00, etc.
For monthly time series, the Nominal Offset can only be (0,0), while for annual time series
it can be (0,0) or e.g. (0,9), which refers to the hydrological year that begins in October.
For 2, 3, 4 or 6 month time scales, as well as for multi-year time scales, the offset is
counted from January 1, 1900. In this way, 3 month periods such as January - March or
February - April can be represented. Nominal Offset is calculated and set automatically by
the program for every time series except annual ones, because of the difference between the
calendar year and the hydrological year, so no special settings are required. Being familiar
with this parameter can be helpful when using aggregation functions.
2.2.5 Use of hydrological year - seasonal aggregations
A complete cycle of periodic hydrological fluctuations lasts a period of time that is called
the hydrological year (Koutsoyiannis, 1997, p. 87). Its use is fundamental in order to
achieve the best statistical independence among annual values. In Greece, the hydrological
year begins on October 1 and ends on September 30 of the next year. For example, 1987-88
refers to the hydrological year that begins on October 1, 1987 and ends on September 30,
1988. The timestamp is recorded on the first day of the hydrological year (e.g. 1/10/1987
00:00, see Table 2.4).
In the current Hydrognomon version, any month can be defined as the start of the
hydrological year. Internationally, October 1 is the most common start of the hydrological
year, followed by July 1. The start of the hydrological year is defined by setting a value
from 0 to 11 months in the Nominal Offset, e.g. for October the Nominal Offset is (0,9),
while for July the Nominal Offset is (0,6).
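The relation between a date, the months part of the Nominal Offset and the hydrological year can be sketched as follows. The function `hydrological_year` is a hypothetical helper, not part of Hydrognomon; the label format follows the "1987-88" style used in the text.

```python
from datetime import date


def hydrological_year(d, offset_months=9):
    """Return (start_date, label) of the hydrological year that contains d.

    offset_months is the months part of the Nominal Offset:
    9 means October, 6 means July, 0 means January (calendar year).
    """
    start_month = offset_months + 1
    start_year = d.year if d.month >= start_month else d.year - 1
    start = date(start_year, start_month, 1)
    if offset_months == 0:
        label = str(start_year)
    else:
        label = f"{start_year}-{(start_year + 1) % 100:02d}"
    return start, label


spring_1988 = hydrological_year(date(1988, 3, 15))         # Greek convention
july_based = hydrological_year(date(1988, 7, 1), offset_months=6)
```

With the default October start, `spring_1988` is `(date(1987, 10, 1), "1987-88")`, so the value is stamped on the first day of the hydrological year.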
Moreover, Hydrognomon's environment can be adjusted to display calendar tables, results,
etc. according to the hydrological year in use. October 1 is the default start of the
hydrological year, but the setting can be changed according to the user's preference.
Annual values may result from aggregation. When aggregation is used to calculate seasonal
values, only some months of the year are used (e.g. March - August seasonal aggregation).
In this case, the annual representation is only indicative and the values refer to a time
period shorter than a year.
In the case of a non-strict time step (Nominal Offset is not set), the time step is rather
an indicative term. Successive daily values can then have timestamps such as:
2010-06-18T08:00, 2010-06-19T10:00, 2010-06-20T08:00, 2010-06-21T08:00. The second value
was recorded not at 08:00 but at 10:00 a.m. This could happen, for example, because the
observer was 2 hours late but was conscientious enough to note it. This is an unpredictable
time irregularity. Such irregularities can be removed by applying an appropriate method,
which usually involves some kind of linear interpolation (see 4.2).
For time series that have a strict time step by definition, the system prevents the use of
a time subdivision that is not a multiple of the Nominal Offset. Since the Nominal Offset is
calculated automatically, the first record is defined automatically by the Nominal Offset
(except for annual time series) and the subsequent records should be multiples of it.
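A consistency check of this kind can be sketched in Python for time steps up to a day. This is only an illustration under stated assumptions: `is_on_grid` is a hypothetical function, the offset is taken as the minutes part of the Nominal Offset, and the step is assumed to divide the 1440 minutes of a day exactly.

```python
from datetime import datetime


def is_on_grid(ts, step_minutes, offset_minutes=0):
    """True if ts is consistent with a strict time step of up to a day.

    Assumes step_minutes divides 1440 and offset_minutes is the minutes
    part of the Nominal Offset (e.g. 480 for daily values at 08:00).
    """
    if ts.second or ts.microsecond:
        return False
    minutes_since_midnight = ts.hour * 60 + ts.minute
    return (minutes_since_midnight - offset_minutes) % step_minutes == 0


stamps = [
    datetime(2010, 6, 18, 8, 0),
    datetime(2010, 6, 19, 10, 0),  # observer was two hours late
    datetime(2010, 6, 20, 8, 0),
    datetime(2010, 6, 21, 8, 0),
]
irregular = [ts for ts in stamps if not is_on_grid(ts, 1440, offset_minutes=480)]
```

For the daily example of the previous paragraph, `irregular` contains only the record of 2010-06-19T10:00, which a strict time series would reject.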
Table 2.3 presents the variable types of processed data time series that may result from
aggregation (see unit 4.3). Aggregated variables may also refer to raw data (such as
rainfall, sunshine duration, etc.).
Table 2.4 presents various timestamp examples for time series data. The way timestamps are
stored depends on the variable type (i.e. it differs depending on whether the variable is
instantaneous or not).
Table 2.4: Timestamps and time reference of variables for some typical cases

Variable type: Instantaneous (refers to a specific instant)
  Time step:      Every time step
  Nominal Offset: 0
  Actual Offset:  - (cannot be defined)
  Duration:       The exact measurement instant.
  Example:        The stage recording of 2005-09-06T00:10 is the value
                  recorded at 00:10 on September 6, 2005.

Variable type: Every variable that refers to a period of time (average,
cumulated, maximum, minimum)

  Time step:      5-minute, 10-minute, hourly, daily and various time steps
                  from a minute to a day
  Nominal Offset: 0
  Actual Offset:  0
  Duration:       The time period is equal to the time step, but the value is
                  recorded at the end of the period.
  Example:        The daily rainfall recorded between 08:01 on November 17,
                  1973 and 08:00 on November 18, 1973 is represented as
                  1973-11-18T08:00.

  Time step:      Monthly
  Nominal Offset: 0
  Actual Offset:  1 month (by default)
  Duration:       The specific month. The value is recorded at the beginning
                  of the month.
  Example:        The monthly rainfall of October 1981 is represented as
                  1981/10 or 1981-10-01T00:00.

  Time step:      Annual
  Nominal Offset: 0
  Actual Offset:  12 months (by default)
  Duration:       The specific year. The value is recorded on January 1.
  Example:        The annual rainfall of the year 1968 is represented as
                  1968 or 1968-01-01T00:00.

  Time step:      Annual (typical Greek hydrological year)
  Nominal Offset: 9 months
  Actual Offset:  12 months (by default)
  Duration:       The specific hydrological year. The value is recorded on
                  October 1.
  Example:        The rainfall of the hydrological year 1989-90 is
                  represented as 1989-90 or 1989-10-01T00:00.
The methodology applied for storing timestamps can be confusing, especially for daily time
series. However, it represents the natural aggregation process (time integration) in a
better way.
As mentioned above, every instant is written in reverse order (ISO 8601): the year is
written at the beginning and the minutes at the end. For example, 02:00 a.m. on
September 6, 2005 is recorded as 2005-09-06T02:00. The separation symbols (-, T, :) can be
modified or even omitted. This inversion favors chronological sorting.
the instance is depicted while in the position of the value is not an empty cell.
NULLs should not be confused with zero values, since zero can be a measured value while
NULL means that no measurement exists. For example, in rainfall tables it should be clear
whether a gap refers to no rainfall (zero value) or to a lack of measurement. Such confusion
could cause incorrect processing, so users should be quite careful.
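The difference matters as soon as values are processed. The Python sketch below uses `None` for NULL and a deliberately cautious policy, assumed here for illustration and not necessarily the one Hydrognomon applies: a total is itself NULL if any contributing value is NULL. The function name `total_rainfall` is hypothetical.

```python
from typing import Optional, Sequence


def total_rainfall(values: Sequence[Optional[float]]) -> Optional[float]:
    """Sum rainfall values, treating None as NULL (no measurement).

    Assumed policy: the total is NULL if any value is NULL, because the
    unmeasured rainfall is unknown rather than zero.
    """
    if any(v is None for v in values):
        return None
    return sum(values)


dry_days = total_rainfall([0.0, 0.0, 0.0])       # three measured dry days
with_gap = total_rainfall([0.0, None, 0.0])      # one day was not measured
```

`dry_days` is a measured total of zero, whereas `with_gap` stays NULL; replacing the gap with zero would silently turn a missing measurement into a dry day.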
2.3.2 Time zone
The time zone of the timestamps is an accompanying time series property, represented as
XXX (UTC+HHMM). XXX is a three-letter code, e.g. EET for Athens winter time or CET for
Central Europe's winter time. The sign + may also be -, and it shows the time deviation
from UTC in hours and minutes. Examples: EET (UTC+0200), CET (UTC+0100). The time zone is
only an informative element and its value is not used by Hydrognomon in calculations. A
future Hydrognomon improvement could be a tool for conversion from one time zone to another.
The standard time zone, which refers to the winter months, is usually preferred over
Daylight Saving Time (known as summer time). There is a list of default time zones, from
which Hydrognomon selects automatically according to the system's properties. However, any
other time zone can be given in the form of a string.
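Since Hydrognomon itself does not use the time zone in calculations, a conversion tool would first have to read the property. The following Python sketch parses the standard form into a name and a UTC offset; `parse_time_zone` is a hypothetical helper, and it tolerates the absence of a space before the parenthesis. Free-form time zone strings, which the text also allows, are not handled.

```python
import re
from datetime import timedelta

# Three letters, then (UTC+HHMM) or (UTC-HHMM), with optional whitespace.
_TZ_PATTERN = re.compile(r"^\s*([A-Za-z]{3})\s*\(UTC([+-])(\d{2})(\d{2})\)\s*$")


def parse_time_zone(text):
    """Return (name, utc_offset) for a string such as 'EET (UTC+0200)'."""
    match = _TZ_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not in the form XXX (UTC+HHMM): {text!r}")
    name, sign, hours, minutes = match.groups()
    offset = timedelta(hours=int(hours), minutes=int(minutes))
    return name, (-offset if sign == "-" else offset)


athens = parse_time_zone("EET (UTC+0200)")
central_europe = parse_time_zone("CET(UTC+0100)")
```

The returned `timedelta` could then be subtracted from a timestamp to obtain UTC, which is the first step of any conversion between two zones.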
2.5 Flagging
Table 2.5: Time series data flags compatible with previous Hydrognomon versions

Flag            Use
RANGE           The value is out of range
SPATIAL         The value is not spatially consistent
TEMPORAL        The value is not temporally consistent
INTERNAL        The value is not internally consistent
ESTIMATED       The value is estimated
SUSPECT         The value is suspect
SNOW            Snow at the moment of the measurement
ICE             Ice at the moment of the measurement
FROST           Frost at the moment of the measurement
DIVE            Flooded station at the moment of the measurement
SPILL           Overflowed station at the moment of the measurement
PUMP            ( )
LOGOVERRUN,     Logger flags of Delta-T
LOGNOISY,
LOGOUTSIDE,
LOGRANGE
HOMOGEN         The value resulted from homogenization
INFILLING       The value resulted from infilling
PENMAN          The value is calculated using the Penman method
DATEINSERT      Raw data did not contain this date. It was created after
                the conversion of the time series to a constant time step.
INCONSISTENT    The value is inconsistent
AUTO            The value has been produced automatically
MISSING         The value comes from a time series of shorter time step in
                which some values were missing. The value was produced
                using the other values (by aggregation, mean value, etc.).
Special properties of time series values and NULLs are represented as binary (on / off)
flags. The proposed flags (compatible with past versions) are presented in Table 2.5. Many
of them refer to a specific data logger (Delta-T), but they can be used as the user sees fit.
The flags of Table 2.5 refer either to raw data (e.g. DIVE, SPILL, SNOW, etc.) or to
processed data (e.g. TEMPORAL, RANGE, MISSING, DATEINSERT, etc.).
In the new Hydrognomon version, flags can be defined at will, using names written in any
language. The flags presented in Table 2.5 are compatible with previous versions and are
proposed for better data standardization.
Every record can be marked with one or more flags. Flags are saved as strings, and different
flags are separated by white space. If the flag field of a record contains a series of
symbols such as FlagA FlagB, it means that two flags have been set: FlagA and FlagB. The
order does not matter at all. Flags can be set by typing the flag names, separated by white
space, in the flag field, or automatically by using the Set Flags window.
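The storage rule for flags (a string of names separated by white space, in any order) maps naturally onto a set. The following Python sketch shows one way to read and write such a string; `parse_flags` and `format_flags` are hypothetical helpers, and the alphabetical output order is an arbitrary choice made here for reproducibility.

```python
def parse_flags(text):
    """Split a flag string into a set of flag names; order and extra spaces are ignored."""
    return set(text.split())


def format_flags(flags):
    """Join flag names with single spaces (sorted only to make the output stable)."""
    return " ".join(sorted(flags))


flags = parse_flags("SNOW ESTIMATED")
flags.add("SUSPECT")           # set one more flag on the same record
stored = format_flags(flags)
```

Because a set is used, `parse_flags("FlagA FlagB")` and `parse_flags("FlagB  FlagA ")` compare equal, which reflects the statement that the order does not matter.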
In case of raw data, flagging can be done automatically by the logger through the data transfer
software.