General Data Management Principles
1. General Data Management Principles (S. Iona)
– SeaDataNet General Overview
– Metadata Directories
– Data Policy and Data Licence
– Rules for metadata submission to prevent duplication
– Data Transport Formats, Reformatting Tools, Vocabularies
– Quality Control and Flag Scale
2. Metadata Directories Management (S. Iona)
– Introduction
– Management of EDMO, EDMERP
– On line Practice (1 hr)
Afternoon Session
– On line Practice (continuation) (approx. 45 min)
[email protected] – www.seadatanet.org
3. Management of EDIOS Metadata (L. Rickards)
2002-2005
EU-FP5 Sea-Search
2006-2011
EU-FP6 SeaDataNet
2011-2015
EU-FP7 SeaDataNet II
SeaDataNet developments
An infrastructure with harmonized services, products and
tools:
– Development of common standards :
Vocabularies, Transport formats
– European catalogues with standardised XML ISO-19115
descriptions
– One unique portal to access all data : virtual data centre
– Set of tools to be implemented in each data centre
• MIKADO: generator of XML descriptions of SeaDataNet catalogues
• NEMO: reformatting software to SeaDataNet formats
• Download Manager: downloading software
• ODV: Ocean data view adapted to SeaDataNet needs
• DIVA: for product generation adapted to SeaDataNet needs
Background
Version 0: 2006-2007
– Continuation and maintenance of past Sea-Search system :
• the data access needed several different requests to each data centre
• and the data sets were delivered in different formats
• No standardized information
Version 1: 2008-2010
– Setup of the integrated online data service to users :
• networking the distributed data centres,
• unique request to the interconnected data centres
• and the data sets are delivered with a unique format
• Interconnecting and mutually tuning the metadata directories in terms of format,
syntax and semantics, e.g.
– ISO 19115 metadata standard for all directories
– Common vocabs, EDMERP, EDMO and CSR references in the metadata
descriptions
– CSR, EDIOS still need content upgrade
Background
Version 2: 2010-2011
– Data product services were added to the infrastructure
– OGC compliant viewing services
– Management of additional data types (EMODNET, Geo-Seas, etc)
SeaDataNet II (2011-2015)
– Metadata directories (only CDI, CSR) extension with OGC CSW
components for automatic harvesting from the SDN nodes
– ISO 19139 transport schema and INSPIRE compliance will be
implemented
Future
Operationally robust and state of the art Pan-European infrastructure
Discovery and Viewing Services
SeaDataNet portal provides an overview of the Marine organisations in
Europe and their involvement in scientific cruises, data collection, marine
projects.
Discovery and Viewing Services
6 European catalogues maintained by NODCs and published at Pan-
European level:
• EDMO : European Directory of Marine Organisations (<2200)
• CSR : Cruise Summary Reports (>31500)
• EDMED : European Directory of Marine Environmental Datasets
(>3000)
• EDMERP : European Directory of Marine Environmental Research
projects (>2500)
• EDIOS : European Directory of Ocean Observing Systems (>270
programmes for the UK alone and many underway for other
European countries)
• CDI : Common Data Index ( >1000000)
General maintenance workflow & available tools
EDMO V1 search and retrieval
http://seadatanet.maris2.nl/edmo
EDMO CMS
http://seadatanet.maris2.nl/vu_organisations/welcome.asp
• Query by data sets (the interface includes time, geographical box search criteria)
• Query by Data Holding Centre
The EDMERP User Interface
http://seadatanet.maris2.nl/v_edmerp/search.asp
Additional details
Browse list
EDMERP CMS
• http://seadatanet.maris2.nl/vu_edmerp/welcome.asp
• capability of creating sub-accounts for institutes in the NODC’s country,
while the NODC safeguards the quality by having the chief editor role before
publishing
CSR V1 Query and Retrieval
http://seadata.bsh.de/csr/retrieve/V1_index.html
POGO/Ocean Going RV
database link
EDMO link
Track chart
CSR V1 CMS for on-line entry
http://seadata.bsh.de/csr/online/V1_index.html
Upload reports
The EDIOS User Interface
http://seadatanet.maris2.nl/v_edios_v2/search.asp
Common Data Index – Data Discovery and Access Service
[Workflow diagram; labels: Search; Include in Basket; Submit + Authentication; Request Confirmed; Check Status in RSM; Results Ready at DC x; SDN; Data format]
SeaDataNet Data Policy History
• Drafted by Project Office, 02/2007
• Reviewed by the Steering Committee
• Validated by the Coordination Group
• Published in April 2007
• Available at:
http://www.seadatanet.org/Data-Access/Data-policy
SeaDataNet Data Policy
• It is derived from the INSPIRE directive for spatial
information, taking into account the national rules and the
SeaDataNet users’ needs.
• Objectives
to serve the scientific community, public organizations,
environmental agencies
to facilitate the data flow through the Transnational Activities
by stating clearly the conditions for submission, access and
use of data, metadata and data-products
SeaDataNet Data Policy
• Links and Framework
SeaDataNet Data Policy is fully compatible with the EU Directives,
International Policies, Laws and Data Principles:
Directive 2003/4/EC of the European Parliament and of the Council of 28 January 2003
on public access to environmental information and repealing Council Directive
90/313/EEC (http://ec.europa.eu/environment/aarhus/index.htm).
INSPIRE Directive for spatial information in the Community (http://inspire.jrc.it/home.html)
IOC Data Policy (http://ioc3.unesco.org/iode/contents.php?id=200)
ICES Data Policy 2006 (https://www.ices.dk/Datacentre/Data_Policy_2006.pdf)
WMO Resolution 40 (Cg-XII; see http://www.nws.noaa.gov/im/wmor40.htm)
Implementation plan for the Global Observing System for Climate in support of the
UNFCCC, 2004; GCOS – 92, WMO/TD No.1219.
Global Earth Observation System of Systems GEOSS 10-Year Implementation Plan
Reference Document (Final Draft) 2005. GEO 204. February 2005.
CLIVAR Initial Implementation Plan, 1998; WCRP No. 103, WMO/TS No. 869, ICPO No.
14. June 1998.
Policy for Data Access and Use
• Metadata
free and open access, no registration required
each data centre is obliged to provide the meta-data in standardized format to populate
the catalogue services
• Data and products
visualisation freely available
the general case is free and without restriction (e.g. academic purposes)
however (due to national policies) mandatory user registration is required (using
Single Sign On (SSO) Service)
a “SeaDataNet role” (partner, academic, commercial etc.) is attributed to each individual user
using the Authentication, Authorization and Administration (AAA) Service
Each NODC attributes the roles to the users of its country
Outside the partnership, the roles are assigned by the SeaDataNet user-desk
When registering, the user must accept the SDN licence agreement
each data centre node delivers data according to the user’s role and its local regulation
each data centre should provide freely the data sets necessary to develop the common
products
SDN License Agreement
• 1. The Licensor grants to the Licensee a non-exclusive and non-transferable licence to retrieve
and use data sets and products from the SeaDataNet service in accordance with this licence.
• 2. Retrieval, by electronic download, and the use of Data Sets is free of charge, unless
otherwise stipulated.
• 3. Regardless of whether the data are quality controlled or not, SeaDataNet and the data source
do not accept any liability for the correctness and/or appropriate interpretation of the data.
Interpretation should follow scientific rules and is always the user’s responsibility. Correct and
appropriate data interpretation is solely the responsibility of data users.
• 4. Users must acknowledge data sources. It is not ethical to publish data without proper
attribution or co-authorship. Any person making substantial use of data must communicate with
the data source prior to publication, and should possibly consider the data source(s) for co-
authorship of published results.
• 5. Data Users should not give to third parties any SeaDataNet data or product without prior
consent from the source Data Centre.
• 6. Data Users must respect any and all restrictions on the use or reproduction of data. The use
or reproduction of data for commercial purpose might require prior written permission from the
data source.
SDN Roles
on BODC Vocabulary Web Server, list C866.
http://seadatanet.maris2.nl/v_bodc_vocab/welcome.aspx
Causes of the duplicates
• RT and DM data sets from operational oceanography
• Data sets from the GTS (real time transmission) with
rounded values and poorly documented profiles
• International Programmes and data exchange/dissemination
• Data insufficiently documented and attributed to two different
sources
• Water sample files including the T,S station with other
parameters
• Data declassified by the Navies with poor meta-data
• …
Why prevent duplication?
• Avoid statistical biases in data products
One measurement could be replicated several times!
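The bias can be illustrated with a few lines of arithmetic. The sketch below uses made-up temperature values (they are not taken from any SeaDataNet data set) and simply shows how a replicated measurement pulls a mean towards itself.

```python
from statistics import mean

# Hypothetical temperatures (degC) from three distinct stations.
unique_values = [10.0, 12.0, 20.0]

# The same data after the third station was ingested two more times
# through other channels (e.g. a regional and a global database).
with_duplicates = unique_values + [20.0, 20.0]

unbiased = mean(unique_values)
biased = mean(with_duplicates)

print("mean without duplicates:", unbiased)
print("mean with duplicates:   ", biased)
```

The replicated station counts three times, so the second mean is shifted towards its value.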
How to handle duplication?
• Duplicate checks as applied locally by partners are
described later under the QC topic
• But, since there are copies of one data set in
several regional databases (ICES), Black Sea
databases, projects (MEDAR), global databases
(WOD05), national databases, etc:
The simplest way to prevent duplication
within the SeaDataNet management system is
for partners to submit only their national data
Data reformatting
In general the original formats of the data files cannot be
used in data management
They include incomplete or non-standardized metadata
They are incompatible with the input format needed by Quality
Control and other processing tools
A unique format is needed for safeguarding and
exchanging the data sets
Data management format, archiving format and transport
(exchange) format are not necessarily the same
Sustainability of the archiving format
The archiving format should:
• be independent of the computer (and libraries)
• ensure that it includes enough metadata to be processed (e.g. location
and date)
• be compatible with, and include at least the mandatory fields (metadata)
requested for, the internationally agreed exchange format(s)
• include additional textual or standardized “history” or “comment” fields
to prevent any loss of information
• provide similar structure and metadata for different data types such as
vertical profiles and time series
These rules normally apply to the exchange
formats as well.
SeaDataNet Data Transport Formats
Data are available from SeaDataNet delivery
services in two ASCII formats and one BINARY:
• ASCII formats for profiles, point series and
trajectories
○ ODV mandatory
○ MEDATLAS optional
• CF-compliant NetCDF BINARY format for gridded
fields and multi-dimensional data types such as
ADCP
SeaDataNet Data Transport Formats
• ASCII formats (ODV, MEDATLAS) have been
modified to carry additional information required
by SeaDataNet:
– provide linkage between data and metadata (CDI
record)
– provide linkage to standardised SeaDataNet
semantic information such as detailed parameter
description
SeaDataNet Data Transport Formats
• NetCDF implementation in SeaDataNet is based
on the CF standard, which is under specification
– Upgrading the NetCDF (CF) standard is planned in
cooperation with UNIDATA (USA) and other experts to
make it better suited for SeaDataNet, MyOcean, etc
– Integration of SDN Common Vocabs, CDI reference in
the metadata header
SeaDataNet ODV Format
• SDN ODV (Ocean Data View) format is a spreadsheet — a
collection of rows (comment, column header and data) with
each data row having the same fixed number of columns
• it allows for a semantic header in which each listed parameter
is mapped to a vocabulary concept in order to avoid
misspelling or misinterpretation
SeaDataNet ODV Format Data Model
• It is based on a spreadsheet model with three
types of row
– Comment row
One cell with text starting with //
It is strongly recommended to enrich comment
rows with usage metadata
– Column header row
contains a label for each column
– Data row
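The three row types can be separated with a few lines of code. The following is a minimal sketch, not the SeaDataNet reference reader: it assumes tab-separated cells (the separator the format requires), and the function name `split_odv_rows`, the column labels and the sample values are all hypothetical.

```python
from typing import List, Tuple


def split_odv_rows(text: str) -> Tuple[List[str], List[str], List[List[str]]]:
    """Split spreadsheet-style text into comment rows, column labels and data rows."""
    comments: List[str] = []
    header: List[str] = []
    data: List[List[str]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("//"):
            comments.append(line)  # comment row: one cell starting with //
        elif not header:
            header = line.split("\t")  # first non-comment row holds the labels
        else:
            cells = line.split("\t")
            if len(cells) != len(header):
                # every data row must have the same fixed number of columns
                raise ValueError("expected %d columns, got %d" % (len(header), len(cells)))
            data.append(cells)
    return comments, header, data


# Hypothetical sample: labels and values are illustrative only.
SAMPLE = (
    "//Hypothetical usage note: cast taken with a CTD\n"
    "Cruise\tStation\tDepth [m]\tQV\tTemperature [degC]\tQV\n"
    "DEMO01\tST1\t5\t1\t18.2\t1\n"
    "DEMO01\tST1\t10\t1\t17.9\t1\n"
)

comments, header, data = split_odv_rows(SAMPLE)
print(len(comments), "comment row,", len(header), "columns,", len(data), "data rows")
```

Rejecting rows with the wrong column count early mirrors the idea that the format check comes before any other control.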
SDN ODV Profile Data Example
SeaDataNet ODV Format Data Model
• The Column header and the data rows have
three types of column
– Metadata columns (standardized and mandatory)
– Primary variable data columns (value + flag)
– Data columns (value + flag pairs)
SDN ODV Profile Data Example
SeaDataNet ODV Format
• Profile extensions
– CDI linkage
Addition of two extra metadata columns (LOCAL_CDI_ID and
EDMO_code)
– Semantic mapping
• Structured comment records immediately preceding the ODV
column header record
• First record is ‘//SDN_parameter_mapping’
• Followed by one mapping record for each data column in the
file
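A reader for the semantic mapping block could look like the sketch below. Only the `//SDN_parameter_mapping` marker comes from the slide; the record layout with `<subject>`, `<object>` and `<units>` elements, the function name and the sample codes are assumptions made for illustration and should be checked against the Data Transport Format manual.

```python
import re
from typing import Dict, List

MAPPING_MARKER = "//SDN_parameter_mapping"

# ASSUMED record layout (verify against the format manual):
# //<subject>local name</subject><object>vocabulary term</object><units>unit term</units>
RECORD = re.compile(
    r"^//<subject>(.*?)</subject><object>(.*?)</object><units>(.*?)</units>"
)


def read_parameter_mapping(lines: List[str]) -> Dict[str, Dict[str, str]]:
    """Collect the mapping records that follow the marker line."""
    mapping: Dict[str, Dict[str, str]] = {}
    in_block = False
    for line in lines:
        stripped = line.strip()
        if stripped == MAPPING_MARKER:
            in_block = True
            continue
        if in_block:
            match = RECORD.match(stripped)
            if match is None:
                break  # the block ends at the first non-mapping line
            subject, obj, units = match.groups()
            mapping[subject] = {"object": obj, "units": units}
    return mapping


# Illustrative lines; the codes are examples, not taken from the slides.
lines = [
    "//SDN_parameter_mapping",
    "//<subject>SDN:LOCAL:Depth</subject><object>SDN:P011::ADEPZZ01</object><units>SDN:P061::ULAA</units>",
    "//<subject>SDN:LOCAL:Temperature</subject><object>SDN:P011::TEMPPR01</object><units>SDN:P061::UPAA</units>",
    "Cruise\tStation\tDepth [m]\tQV\tTemperature [degC]\tQV",
]

mapping = read_parameter_mapping(lines)
for name, target in mapping.items():
    print(name, "->", target["object"], target["units"])
```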
SDN ODV Profile Data Example
SeaDataNet ODV Format
• File extension should be .txt (required by the Download Manager, DM)
• Field separator is the tab character (not semi-colon) (DM
requirement)
– Further description and other examples in the Data Transport
Format manual at:
http://www.seadatanet.org/Standards-Software/Data-Transport-Formats
SeaDataNet MEDATLAS Format
CRUISE HEADER
SeaDataNet MEDATLAS Profile Example
STATION HEADER
SeaDataNet MEDATLAS Profile Example
data
SeaDataNet MEDATLAS Profile Example
STATION HEADER
Semantic mapping
CDI linkage
SeaDataNet MEDATLAS Format
• The local identifier of the station must be unique because it is the
communication link between the portal and the local system
– Concatenation of MEDATLAS station code, EDMO_CODE and station data type.
• MEDATLAS identifiers
Cruise code (unique):
CDI Identifier
NetCDF (CF compliant) data format
• NetCDF is a set of data formats, programming interfaces, and
software libraries that help read and write scientific data files.
• NetCDF files are self documenting. That is, they include the
units of each variable and notes about what it means and how it
was collected
NetCDF data format
• Like most binary formats, the structure of a
netCDF file consists of header information,
followed by the raw data itself.
• The header information includes information
about how many data values have been stored,
what sorts of values they are, and where within
the file the header ends.
• NetCDF is specifically suited to storing multidimensional
data arrays.
NetCDF data file structure
Data and metadata reformatting tools
• Med2MedSDN java tool (available under Windows)
• reformats MEDATLAS files to MEDATLAS SeaDataNet
format
• adds the SeaDataNet extensions : LOCAL_CDI_ID and
EDMO_CODE and mapping for parameters
• linked to SeaDataNet vocabularies through Web services
for parameters mapping and for list of EDMO codes
• generates a coupling file for the SeaDataNet download
manager
• Latest Version 1.1.07 and user manual available at:
http://www.seadatanet.org/Standards-Software/Software/Med2MedSDN
SeaDataNet reformatting tools and vocabs
Vocabularies
• At the start of SeaDataNet vocabularies were
poorly managed
• Metadata populated from Sea-Search libraries
– Weak content and technical governance
– Multiple local copies, each slightly different
– Interoperability compromised by this
SeaDataNet Developments
• Content governance
– Management by individuals replaced by collaborative
discussion groups
• SeaDataNet – the SeaDataNet Technical Task Team
• SeaVoX – SeaDataNet TTT plus international experts
from IODE and academic communities
• Platforms – ICES-led group concerned with platform
code management
• Geo-Seas – partner subgroup in the OGS “Colla”
collaborative environment
SeaDataNet Developments
• Technical Governance
– Through the NERC Vocabulary Server technology
• Clearly defined master copy of all vocabularies
• Formally versioned with updates published daily
• Every vocabulary and every term represented by a URI that resolves
to a SKOS XML document delivering labels, definitions and mappings
• Clients developed such as the Maris Parameter Thesaurus Browser
(http://seadatanet.maris2.nl/v_bodc_vocab/vocabrelations.aspx?list=P081)
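To show what a client does with such a SKOS XML document, here is a minimal sketch using Python's standard XML parser. The sample document, its example.org URIs and the function name `read_concept` are hypothetical; a real document returned by the vocabulary server may carry more elements and different mapping relations.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Hypothetical SKOS document for one term; URIs and texts are made up.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/vocab/DEMO/TEMP">
    <skos:prefLabel>Temperature of the water column</skos:prefLabel>
    <skos:definition>Hypothetical definition text.</skos:definition>
    <skos:exactMatch rdf:resource="http://example.org/vocab/OTHER/T"/>
  </skos:Concept>
</rdf:RDF>"""


def read_concept(xml_text: str) -> dict:
    """Extract the label, definition and mappings of the first concept."""
    root = ET.fromstring(xml_text)
    concept = root.find("skos:Concept", NS)
    if concept is None:
        raise ValueError("no skos:Concept element found")
    return {
        "uri": concept.get("{%s}about" % NS["rdf"]),
        "label": concept.findtext("skos:prefLabel", namespaces=NS),
        "definition": concept.findtext("skos:definition", namespaces=NS),
        "mappings": [
            m.get("{%s}resource" % NS["rdf"])
            for m in concept.findall("skos:exactMatch", NS)
        ],
    }


concept = read_concept(SAMPLE)
print(concept["uri"], "|", concept["label"], "|", concept["mappings"])
```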
SeaDataNet Developments
• Population
– There are close to 100 vocabularies deemed of
interest to SeaDataNet and Geo-Seas. Used
for:
• Populating metadata fields in EDMED, CSR,
EDIOS and CDI documents
• Tagging parameters in data files
Vocabularies
A prerequisite for using the SDN reformatting
tools is:
– Preparation of the mapping between the metadata
and:
• SeaDataNet vocabularies : Sea areas, BODC
parameters (PDV), Platform classes, SDN device
categories, etc
– some automatic mapping is already available in NEMO,
MIKADO, Med2MedSDN
• EDMO : Marine organisations
• EDMERP : Marine environmental projects
Growth of the P011 Vocabulary
Vocabularies for Metadata
List code List Name
C16 SeaDataNet Sea Areas
C77 ICES ROSCOP data types
C174 SeaDataNet CSR ship metadata
C180 IOC country codes
C320 ISO countries
C371 Ten-degree Marsden Squares
C381 Ports Gazetteer
L05 SeaDataNet device categories
L021 SeaDataNet Geospatial Feature Types
L031 SeaDataNet Measurement Periodicity Classes
L051 SeaDataNet sample collector categories
L061 SeaDataNet Platform Classes
L071 SeaDataNet data access mechanisms
L081 SeaDataNet Data Access Restriction Policies
L101 SeaDataNet geographic co-ordinate reference frames
L111 Height and Depth Vertical Co-ordinate Reference Datum
L181 ROSCOP sample quantification units
L201 SeaDataNet measures and qualifier flags
L231 SeaDataNet metadata entities
L241 SeaDataNet data transport formats
L300 MEDATLAS Data Centres
P011 BODC Parameter Usage Vocabulary
P021 BODC Parameter Discovery Vocabulary
P061 BODC data storage units
P081 SeaDataNet Parameter Disciplines
P091 MEDATLAS Parameter Usage Vocabulary
EDMO European Directory of Marine Organizations
EDMERP European marine projects
Vocabularies for Data
Vocabularies Mappings
• Available mappings between different
vocabulary lists are provided by the BODC
Vocabulary Server Mappings Index (C970) at:
http://seadatanet.maris2.nl/v_bodc_vocab/search.asp?name=(C970)%20Vocabulary+Server+Mappings+Index&l=C970
• The clients at http://seadatanet.maris2.nl/v_bodc_vocab/welcome.aspx
fulfill most needs of SeaDataNet partners
• BODC clients at http://vocab.ndg.nerc.ac.uk/ cover
more vocabularies for those interested in going
beyond SeaDataNet
Future Developments
• NETMAR FP7 project
– NERC Vocabulary Server development forms the bulk of one work
package
• V2 available by the end of 2011
– Thesaurus/ontology server as well as a vocabulary server
– SKOS compliant with W3C accepted version
– Mappings to external resources (e.g. GEMET)
– Fully RESTful read and secured write interface with improved API
– Multi-lingual capability
• Vocabulary/term URI addressing will be maintained
• V1 will be maintained until confirmed dead by service monitoring
Objectives of QC
Good quality research depends on good quality data and
good quality data depends on good quality control
methods.
QC procedures
• The QC procedures for oceanographic data according to IOC, ICES and EU
recommendations include automatic and visual controls on the data and their
metadata.
• Data measured by the same instrument and coming from the same “cruise”
are organized in the same file, transformed to the same exchange format and
then subjected to a series of quality tests:
• Check of the Format
• Check of the location and date
• Check of the measurements
• The results of the automatic control are added as QC flags to each data value.
• Validation or correction is made manually to the QC flags and NOT to the data.
• In case of uncertainties, the data originator is contacted.
• All QC procedures applied to the data are fully documented by DCs
SEADATANET Quality Flag values (L201)
(Based on IGOSS/UOT/GTSPP & Argo QC flags)
Format Check
• Detects anomalies like wrong platform codes or
names, parameters name or units, missing
mandatory information like reference to a cruise
or observation system, source laboratory, sensor
type
• No further control should be made before the
correction and validation of the archive format
Automatic Checks of location and date
• For vertical profiles
(CTD, XBT, MBT, Bottle Data, etc)
• duplicate entries within a space-time radius
• date: reasonable date, station date within the begin and end date
of the cruise
• ship velocity between two consecutive stations.
• (e.g., speed > 15 knots (threshold value) means wrong
station date or wrong station location )
• location/shoreline: on land position
• bottom sounding: out of the regional scale, compared with the
reference surroundings
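The ship velocity check above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular data centre: distances are computed with the haversine formula on a spherical Earth, the 15 knot threshold is the example value from the slide, and the function names and station values are hypothetical.

```python
import math
from datetime import datetime

EARTH_RADIUS_NM = 3440.065  # mean Earth radius (6371 km) in nautical miles


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def too_fast_pairs(stations, max_knots=15.0):
    """Return index pairs of consecutive stations whose implied speed exceeds max_knots."""
    flagged = []
    for i in range(len(stations) - 1):
        (t1, lat1, lon1), (t2, lat2, lon2) = stations[i], stations[i + 1]
        hours = (t2 - t1).total_seconds() / 3600.0
        dist = distance_nm(lat1, lon1, lat2, lon2)
        # A non-positive elapsed time is treated as suspicious as well.
        speed = dist / hours if hours > 0 else math.inf
        if speed > max_knots:
            flagged.append((i, i + 1))
    return flagged


# Hypothetical cruise: (time, latitude, longitude) for three stations.
stations = [
    (datetime(2011, 5, 1, 8, 0), 36.0, 24.0),
    (datetime(2011, 5, 1, 12, 0), 36.5, 24.0),  # about 30 nm in 4 h
    (datetime(2011, 5, 1, 14, 0), 38.5, 24.0),  # about 120 nm in 2 h
]

flagged = too_fast_pairs(stations)
print("suspicious station pairs:", flagged)
```

A flagged pair only says that one of the two dates or positions is wrong; deciding which one still needs the visual check.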
Visual Checks of location and date of cruises
Automatic Checks of location and date
For time series from fixed moorings (Current
Meters, ADCP, Sediment Traps, etc)
• depth checks: less than the bottom depth
• series duration checks: consistency with the start
and end date of the dataset
• duplicate moorings checks
• land position checks
Duplicate Checks
– Conventional techniques
• Algorithms
comparison of the location, time of the measurements
(5 miles, 15 mins in GTSPP)
comparison of the measurements
comparison of extra metadata (platform codes- floats id, … )
• Visualization of ships tracks, transects, …
– Advanced techniques:
• Computation of an electronic signal/Unique data identifier -CRC Tag
(GTSPP report 2002)
• With a more experimental approach giving more weight on some metadata like
platform code, position, time, …
Need of reliable metadata
Keep the most complete data set
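The conventional space-time comparison and the "keep the most complete data set" rule can be sketched as below. The 5 mile and 15 minute windows are the GTSPP figures quoted above; treating "miles" as nautical miles, the dictionary layout of a station and the function names are assumptions made for this example.

```python
import math
from datetime import datetime

EARTH_RADIUS_NM = 3440.065  # mean Earth radius (6371 km) in nautical miles


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def is_duplicate_candidate(a, b, max_nm=5.0, max_minutes=15.0):
    """True when two stations fall inside the same space-time window."""
    near = distance_nm(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_nm
    simultaneous = abs((a["time"] - b["time"]).total_seconds()) <= max_minutes * 60
    return near and simultaneous


def keep_most_complete(a, b):
    """Prefer the station with more non-missing measurements."""
    def count(station):
        return sum(v is not None for v in station["values"])
    return a if count(a) >= count(b) else b


# Hypothetical real-time and delayed-mode versions of the same cast.
real_time = {"time": datetime(2011, 5, 1, 8, 0), "lat": 36.00, "lon": 24.00,
             "values": [18.2, None, 17.1]}
delayed = {"time": datetime(2011, 5, 1, 8, 5), "lat": 36.01, "lon": 24.01,
           "values": [18.23, 17.95, 17.12]}
elsewhere = {"time": datetime(2011, 5, 1, 8, 5), "lat": 37.00, "lon": 24.00,
             "values": [16.0, 15.5, 15.1]}

print(is_duplicate_candidate(real_time, delayed))    # same window
print(is_duplicate_candidate(real_time, elsewhere))  # about 60 nm apart
kept = keep_most_complete(real_time, delayed)
```

A positive result only marks a candidate; comparing the measurements and extra metadata, as listed above, is still needed before one copy is dropped.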
Metadata QC results
– According to MEDATLASII QC flag scale
Automatic Checks of measurements
• For vertical profiles and time series
– presence of at least two parameters: vertical/time reference + measurement
– pressure/time must be monotonically increasing
– the profile/time series must not be constant: sensor jammed
– broad range checks: check for extreme regional values compared with the min. and
max. values for the region. The broad range check is performed before the narrow
range check.
– data points below the bottom depth
– spikes detection: usually requires visual inspection. For time series a filter is applied
first to remove the effect of tides and internal waves.
– narrow range check: comparison with pre-existing climatological statistics. Time series
are compared with internal statistics.
– density inversion test: (potential density anomaly, FOFONOFF and MILLARD, 1983,
MILLERO and POISSON, 1981)
– Redfield ratio for nutrients: ratio of the oxygen, nitrate and alkalinity (carbonates)
concentration over the phosphate (172, 16 and 122 in Atlantic and Indian ocean,
Takahashi & al)
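Three of the simpler checks in this list (monotonically increasing pressure, constant profile, broad range) can be written in a few lines. This is a minimal sketch: the function names, the sample profile and the 0 to 35 degC limits are hypothetical, and real regional limits come from the parameterization tables.

```python
def pressure_is_increasing(pressure):
    """True when every level is strictly deeper than the previous one."""
    return all(later > earlier for earlier, later in zip(pressure, pressure[1:]))


def is_constant(values):
    """True when all values are identical, which suggests a jammed sensor."""
    return len(values) > 1 and max(values) == min(values)


def outside_broad_range(values, lower, upper):
    """Return the indices of values outside the regional [lower, upper] limits."""
    return [i for i, v in enumerate(values) if not (lower <= v <= upper)]


# Hypothetical profile with one reversed pressure level and one extreme value.
pressure = [5.0, 10.0, 20.0, 15.0]
temperature = [18.2, 18.0, 45.0, 17.5]

print("pressure increasing:", pressure_is_increasing(pressure))
print("constant profile:   ", is_constant(temperature))
print("out of broad range: ", outside_broad_range(temperature, 0.0, 35.0))
```

In line with the procedure described earlier, a failed check would set a QC flag on the value; the value itself is left unchanged.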
Broad Range Check
• Regional and depth parameterization in
MEDAR/MEDATLASII
http://www.ifremer.fr/sismer/program/medar/htql/liste_region.htql
Narrow Range Check
• qc flag=2, probably good data, (result
of auto control)
• qc=1 (manually)
• The automatic comparison with
reference climatologies is made by
linearly interpolating the references at
the level of the observation
• Outliers are detected if the data points
differ from the references more than:
Density inversion test, the importance of visual check
Spikes Check
– The test is sensitive to the vertical/time resolution.
– It requires at least 3 consecutive good/acceptable values.
– It requires 2 consecutive at the surface and the bottom.
– The IOC algorithm to detect spikes takes into account
the difference in values (for regularly spaced data like CTD):
• |V2 - (V3+V1)/2| - |V1-V3|/2 > THRESHOLD VALUE
– For irregularly spaced values (like bottle data) a better
algorithm to detect spikes, taking into account the
difference in gradients instead of the difference in values, is:
• ||(V2-V1)/(P2-P1) - (V3-V1)/(P3-P1)| - |(V3-V1)/(P3-P1)|| > THRESHOLD VALUE
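Both spike statistics translate directly into code. The sketch below implements the two expressions as written on the slide, with V1, V2, V3 as three consecutive values and P1, P2, P3 as their pressures. The function names, the sample profile and the threshold of 2.0 are hypothetical; operational thresholds depend on the parameter and region.

```python
def spike_value_statistic(v1, v2, v3):
    """|V2 - (V3+V1)/2| - |V1-V3|/2, for regularly spaced data."""
    return abs(v2 - (v3 + v1) / 2.0) - abs(v1 - v3) / 2.0


def spike_gradient_statistic(v1, v2, v3, p1, p2, p3):
    """||g12 - g13| - |g13||, for irregularly spaced data."""
    g12 = (v2 - v1) / (p2 - p1)
    g13 = (v3 - v1) / (p3 - p1)
    return abs(abs(g12 - g13) - abs(g13))


def find_spikes(values, threshold):
    """Indices of interior points whose value statistic exceeds the threshold."""
    return [
        i
        for i in range(1, len(values) - 1)
        if spike_value_statistic(values[i - 1], values[i], values[i + 1]) > threshold
    ]


# Hypothetical CTD temperatures with one obvious spike at index 2.
temperature = [15.0, 14.9, 19.0, 14.7, 14.6]
spikes = find_spikes(temperature, threshold=2.0)
print("spike indices:", spikes)

# Gradient form for the same three points at irregular pressures.
gradient_stat = spike_gradient_statistic(14.9, 19.0, 14.7, 10.0, 20.0, 50.0)
print("gradient statistic:", round(gradient_stat, 3))
```

Subtracting the half-difference of the neighbours keeps a steady gradient from being mistaken for a spike, which is why the neighbours of the bad point are not flagged here.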
Large temperature inversion and gradient tests
• World Ocean Data Centre, NODC Ocean Climate
Laboratory.
• Relies solely on temperature data to quantify the
maximum allowable temperature increase with depth
(inversion, 0.3 C per m) and decrease with depth
(excessive gradient, 0.7 C per m)
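A minimal sketch of this test, assuming the two limits apply to the temperature change between adjacent observed levels: an increase with depth above 0.3 C per m is reported as an inversion and a decrease above 0.7 C per m as an excessive gradient. The function name and the sample profile are hypothetical.

```python
def temperature_gradient_failures(depth, temp, max_inversion=0.3, max_gradient=0.7):
    """Return (index, kind) for each level that fails the inversion or gradient limit."""
    failures = []
    for i in range(1, len(depth)):
        dz = depth[i] - depth[i - 1]
        if dz <= 0:
            raise ValueError("depths must be strictly increasing")
        rate = (temp[i] - temp[i - 1]) / dz  # degC per metre, positive = warming with depth
        if rate > max_inversion:
            failures.append((i, "inversion"))
        elif -rate > max_gradient:
            failures.append((i, "gradient"))
    return failures


# Hypothetical profile: warming of 0.45 C/m, then cooling of 1.4 C/m.
depth = [0.0, 10.0, 20.0, 30.0]
temp = [20.0, 19.5, 24.0, 10.0]

failures = temperature_gradient_failures(depth, temp)
print(failures)
```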
Measurements QC results
–According to MEDATLASII qc flag scale
Real Time QC in Operational Oceanography
(such as Argo, GTSPP and GOSUD Programmes of IOC/IODE)
ARGO Real-Time QC on vertical profiles
• The Delayed-Mode QC in Coriolis Data centre for profiles and time series consists of
Visual QC, objective analysis and residual analysis (to correct sensor drift and offsets).
Sea Level Data QC
(Based on ESEAS-RI Project)
• Near Real Time QC (L1)
  • Detection of strange characters
  • Wrong assignment of date and hour
  • Spike test
  • Outliers
  • Gaps
  • Constant values detection (stability test)
  • Filtering to hourly values
  • Computation of residuals
• Delayed Mode QC (L2)
  • Detection of strange characters
  • Wrong assignment of date and hour
  • Spike test
  • Gaps
  • Constant values detection (stability test)
  • Interpolation of short gaps and filtering to hourly values
• Delayed Mode-Higher Level QC
  • Tidal analysis
  • Computation and inspection of residuals
  • extremes
  • Statistics means
  • Comparison with neighbouring tide gauges (correlations)
  • Standard Normal Homogeneity Test
  • EOF Analysis
Real Time QC limitations
• The real time qc tests are limited and automatic
because the data must be distributed with minimal
delay.
• After real time QC, visual QC and calibrations
(delayed mode qc) are necessary before data
distribution.
World Ocean Data Centre
• The QC procedures in the WDC, Ocean Climate
Laboratory are summarized in three major parts:
1. Check of the observed level data
• For the construction of the climatology –
processing
2. Interpolation to standard levels
3. Standard level data checks
World Ocean Data Centre
1. Checks of the observed level data
– Format conversion
– Position/date/time check
– Assignment of cruise and cast numbers
– Speed check
– Duplicate profile/cruise checks
– Range checks
– Depth inversion and depth duplication checks
– Large temperature inversion and gradient tests: to quantify the maximum
allowable temperature increase with depth (inversion) and decrease
(excessive gradient) with depth (0.3 C per m, 0.7 C per m)
– Observed level density inversion checks
World Ocean Data Centre
• Regional parameterization of the world ocean in WOD09.
(plus vertical parameterization)
World Ocean Data Centre
2. Interpolation to standard levels
– Modified Reiniger – Ross scheme (Reiniger and Ross, 1968): less
spurious features in regions with large vertical gradients than a 3-point
Lagrangian interpolation.
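For orientation, the 3-point Lagrangian interpolation that the slide uses as the point of comparison can be sketched as below. This is not the modified Reiniger-Ross scheme itself, which combines several estimates to damp overshoots; the function name and the sample values are hypothetical.

```python
def lagrange3(z, depths, values):
    """Interpolate to depth z through three observed (depth, value) points."""
    (z1, z2, z3), (v1, v2, v3) = depths, values
    return (
        v1 * (z - z2) * (z - z3) / ((z1 - z2) * (z1 - z3))
        + v2 * (z - z1) * (z - z3) / ((z2 - z1) * (z2 - z3))
        + v3 * (z - z1) * (z - z2) / ((z3 - z1) * (z3 - z2))
    )


# Hypothetical observed levels; the values follow v = z**2, so the parabola
# through the three points reproduces the curve exactly at the standard level.
observed_depths = (0.0, 10.0, 30.0)
observed_values = (0.0, 100.0, 900.0)

standard_level = 20.0
interpolated = lagrange3(standard_level, observed_depths, observed_values)
print(interpolated)
```

Because the parabola is forced through all three points, a sharp vertical gradient between two of them can produce overshoots at the standard level, which is the kind of spurious feature the slide refers to.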