GeospatialDataQualityGuideENVFinal Rev1
All content following this page was uploaded by Yvan Bédard on 21 January 2016.
Welcome to
the Webinar!
Reminder
Please put your mobile devices on
mute during the presentation.
CGDI Webinar: Guide to Geospatial Data Quality
Registrants in Canada: 92
[Map: number of registrants by province]
Other registrants:
• Netherlands
• Sweden
• Ireland
• Spain
• USA
• Ukraine
• Nigeria
• Uruguay
• Ecuador
Today’s program:
GeoConnections
The GeoConnections program is a national initiative, led by Natural
Resources Canada, designed to facilitate access to and use of authoritative
geospatial information in Canada. GeoConnections supports the integration
and use of the Canadian Geospatial Data Infrastructure (CGDI).
GeoConnections: Objectives
Guest Speaker
Dr. Yvan Bédard is a consultant in research management and the
Strategic/Scientific Senior Advisor of Intelli3, a private company
specializing in Geo-BI and GeoData Analysis. For 28 years he was a full
professor and successful researcher in the Department of Geomatics
Sciences at Université Laval. He was the founding director of the Centre
for Research in Geomatics and a member of several strategic and scientific
committees in Canada. His expertise is in GIS, spatial databases,
business intelligence and analytics, as well as in the most recent
spatial data quality issues. He holds a B.Sc.A. in Surveying, an
M.Sc. in Geodesy, and a Ph.D. in Civil Engineering, and has been
involved in computer science for over 35 years.
A Guide to Geospatial Data Quality
Guide Outline
Introduction
Section Overview
B2B, B2C, C2C
The roles and responsibilities of producers, distributors and
users of geospatial data and services are evolving
DEFINING NEEDS/REQUIREMENTS (ISO 19131)
Selecting/Producing a dataset
Dataset accepted? (Yes / No)
MANAGING RISK of inappropriate use of geospatial data (ISO 31000)
COMMUNICATING PROPERLY WITH ALL PLAYERS (ISO TC 145/ISO 3864-2)
MONITORING / REVIEWING AS NECESSARY
Dataset level
Subset level
Feature instance level
Feature attribute level
Attribute value level
(Adapted from Devillers, 2004)
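As a rough illustration of the levels listed above, the sketch below models them as an ordered enumeration. It assumes, which the slide does not state, that a quality statement recorded at a more detailed level takes precedence over one recorded at a broader level. The names QualityLevel and most_specific, and the accuracy values, are hypothetical.

```python
from enum import IntEnum


class QualityLevel(IntEnum):
    """Levels at which quality can be documented, broadest first."""
    DATASET = 1
    SUBSET = 2
    FEATURE_INSTANCE = 3
    FEATURE_ATTRIBUTE = 4
    ATTRIBUTE_VALUE = 5


def most_specific(statements):
    """Return the (level, text) pair recorded at the most detailed level.

    Assumption for illustration only: the most detailed level wins.
    """
    return max(statements, key=lambda pair: pair[0])


# Made-up quality statements, for illustration only.
statements = [
    (QualityLevel.DATASET, "positional accuracy: 10 m"),
    (QualityLevel.FEATURE_INSTANCE, "positional accuracy: 2 m"),
]
level, text = most_specific(statements)
print(level.name, "->", text)
```

Other precedence rules are possible; the point is only that each quality statement carries the level at which it applies.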
Overall Quality
Risk to manage
STEPS TO ENSURE QUALITY
- Definition
- Implementation
- Control
EVALUATING
DOCUMENTING
DISSEMINATING QUALITY INFORMATION
Risk Management
New/Update Design
• Application schema (ISO 19109) OR Schema for coverage geometry and functions (ISO 19123)
• Feature catalog (ISO 19110)
• (…)
Quality evaluation/control (ISO 19157)
Product documentation: user manual, warnings, and other user-centered documentation
(…)
Well-documented data or quality-aware application
Can be:
• Formal (ISO 19157)
• Semi-formal
• Informal (text, 5-star rating, etc.)
Audience: experts to non-experts
Evaluate (ISO 19157):
• Data quality units
• Data quality measures
• Data quality evaluation procedures
Against specifications/requirements (ISO 19131, used to formally describe the needs)
Against needs: informal quality evaluation (inspired by ISO 19157 and by ISO 19131)
[Figure: efforts versus unexpected negative impacts]
ASSESSING RISK
- Identification
- Analysis
- Evaluation
BUILDING RISK RESPONSES
- Mitigation
- Avoidance
- Transfer/sharing
- Acceptance
COMMUNICATING PROPERLY TO ALL PLAYERS
MONITORING + REVIEWING AS NECESSARY
Risk Management
N.B. the complementary IEC 31010 standard provides guidance on risk assessment
techniques
Example:
Road network model (extract from
Levesque, Bédard, Gervais, & Devillers,
2007)
Warning symbols to
highlight potential
problems
Example:
The CanVec+ Feature
Catalog (extract)
Add a section to
describe warnings
http://ftp2.cits.rncan.gc.ca/pub/canvec/doc/CanVec_feature_catalogue_en.pdf
ISO 19157: verify the DQ_LogicalConsistency of the schema
Example: UML schema (extract; source citation truncated, "… 2004)")
Classes shown: AreaOfInterest, Image, Band, Pixel, DNValue, CV_GridValuesMatrix (from Elevation)
Association roles shown: +element, +source
Attributes shown:
+ interpolationType : CV_InterpolationMethod = nearestNeighbor
+ domainExtent : EX_Extent
+ rangeType : RecordType
+ gridRange : CV_GridRange
+ values : Sequence<Record>
+ sequencingRule : CV_SequenceRule
+ startSequence : CV_GridCoordinate
Example:
The CanVec+ Data Product Specifications (extract)
ISO 19157: verify the DQ_LogicalConsistency of the schema
Section in the specifications to describe the expected quality
http://ftp2.cits.rncan.gc.ca/pub/canvec+/doc/CanVec+_product_specifications.pdf
Example:
Constraint repository (adapted from Normand, 1999)
• Inter-theme (e.g., “dam” can share geometry with “road”)
Constraints may help in controlling all DQ_Elements
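To make the inter-theme example concrete, here is a minimal sketch of a constraint repository held as a lookup of feature-type pairs that are allowed to share geometry. Only the "dam"/"road" pair comes from the slide; the function names and the observed pairs are hypothetical, and a real repository (such as the one adapted from Normand, 1999) would hold many more kinds of constraints.

```python
# Hypothetical repository: unordered pairs of feature types that are
# allowed to share geometry. Only {"dam", "road"} comes from the slide.
SHARED_GEOMETRY_ALLOWED = {
    frozenset({"dam", "road"}),
}


def may_share_geometry(type_a: str, type_b: str) -> bool:
    """Check an inter-theme constraint for two feature types."""
    return frozenset({type_a, type_b}) in SHARED_GEOMETRY_ALLOWED


def find_violations(shared_pairs):
    """Return the pairs that share geometry without being allowed to."""
    return [pair for pair in shared_pairs if not may_share_geometry(*pair)]


# Pairs of feature types found to share geometry in a made-up dataset.
observed = [("road", "dam"), ("building", "lake")]
print(find_violations(observed))
```

Storing each pair as an unordered set means "dam/road" and "road/dam" are treated as the same constraint.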
Detailed steps
Example:
The CanVec+ 082C metadata (extract)
Data quality elements (DQ_Elements) in metadata
ISO 19115 metadata already contains part of this information, but in
technical jargon that is usually unintelligible to most users (see Gervais 2004
for the correspondence)
Example of a Warnings and safety section of a user manual (extract from Gervais, 2004)
ISO 19157: values for DQ_Elements will influence the product documentation
Example of a Troubleshooting section of a user manual (extract from Gervais, 2004)
Example:
Use of symbols to facilitate
the reading of a quality report
(private report by Gervais,
Bédard and Larrivée, 2007)
Example of a context-sensitive
warning of inconsistency after a
query in a quality-aware application
(Gervais et al., 2009). The warning
contains 3 parts recommended by
ISO: level of risk, nature of problem,
action to solve problem
ISO 19157: values for
DQ_Elements will
dictate how the product
should be used
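The three-part warning described above (level of risk, nature of problem, action to solve problem) can be sketched as a small record. This is an illustration only: the class name QualityWarning, the rendering format and the sample text are hypothetical and are not taken from the application described by Gervais et al. (2009).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityWarning:
    """A warning with the three parts named on the slide."""
    risk_level: str   # level of risk
    problem: str      # nature of the problem
    action: str       # action to solve the problem

    def render(self) -> str:
        """Format the three parts as a single message."""
        return (
            f"[{self.risk_level.upper()}] {self.problem} "
            f"Suggested action: {self.action}"
        )


# Made-up content for illustration only.
warning = QualityWarning(
    risk_level="caution",
    problem="The two layers in this query were captured at different scales.",
    action="Compare results with a larger-scale source before deciding.",
)
print(warning.render())
```

Keeping the three parts as separate fields lets an application decide how to present each one, for example with a symbol for the level of risk.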
Examples:
Quality unit 1:
  MD_Scope: dataset
  DQ_Elements: DQ_LogicalConsistency, DQ_Completeness
Quality unit 2:
  MD_Scope: feature type (hydrant)
  DQ_Element: DQ_QuantitativeAttributeAccuracy
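The two quality units above pair a scope with the quality elements evaluated for that scope. A minimal sketch of that pairing follows; the class QualityUnit and the helper units_for are hypothetical names and do not reproduce the ISO 19157 data model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class QualityUnit:
    """A scope plus the quality elements evaluated for that scope."""
    scope_level: str                     # e.g. "dataset" or "feature type"
    elements: Tuple[str, ...]            # DQ_Element names
    scope_detail: Optional[str] = None   # e.g. the feature type name


# The two units from the slide.
unit_1 = QualityUnit("dataset", ("DQ_LogicalConsistency", "DQ_Completeness"))
unit_2 = QualityUnit(
    "feature type", ("DQ_QuantitativeAttributeAccuracy",), "hydrant"
)


def units_for(element, units):
    """List the units in which a given quality element is evaluated."""
    return [u for u in units if element in u.elements]


print(units_for("DQ_Completeness", [unit_1, unit_2]))
```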
• DQ_FullInspection
• DQ_SampleBasedInspection
• DQ_IndirectEvaluation
Example:
Area guided non-random sampling method
X X X
X X X X X X X
X X X X X X X
X X X
See ISO 19157:2013 for more examples
Examples:
DQ_QuantitativeResult (DQ_CompletenessCommission), Number of excess items: 3
DQ_ConformanceResult (DQ_CompletenessCommission), Number of excess items: pass
DQ_DescriptiveResult (DQ_LogicalConsistency), Conceptual schema compliance: “The rules of the CanVec+ conceptual schema are
all recorded and validated in the source database containing the CanVec+ product. This approach ensures the
conceptual consistency between the conceptual schema and the CanVec+ product.” (extract from the CanVec+ 082C
metadata)
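The first two results above report the same measure in two ways: as a count and as a pass/fail verdict. The sketch below shows one way the two could be derived. It assumes a reference list standing in for the items that should exist and a hypothetical acceptance threshold max_excess; the identifiers are made up, and the function names are not part of ISO 19157.

```python
def excess_items(dataset_ids, reference_ids):
    """Items present in the dataset but absent from the reference."""
    return set(dataset_ids) - set(reference_ids)


def commission_results(dataset_ids, reference_ids, max_excess):
    """Return a quantitative value and a pass/fail conformance result."""
    count = len(excess_items(dataset_ids, reference_ids))
    return {
        "quantitative": count,               # number of excess items
        "conformance": count <= max_excess,  # True means "pass"
    }


# Made-up identifiers: three dataset items have no counterpart in the
# reference, mirroring the "3" shown in the slide example.
dataset = ["h1", "h2", "h3", "h4", "h5"]
reference = ["h1", "h2"]
print(commission_results(dataset, reference, max_excess=3))
```

With a stricter threshold (for example max_excess=2) the same count of 3 would yield a failed conformance result.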
Example:
• For consumers in the B2C and C2C contexts: various means are
used, unknowingly following the ISO 19157 rationale in a less
rigorous manner. Consumers typically match their measurement method
to their quality representation method.
Examples:
Use of a 5-star rating
system to rate the
representation of
buildings in a virtual
globe environment
(from Jones, 2011)
Example: Analysis of potential risks of inappropriate use of geospatial data
Feature / Attribute | Identified Risks | Impact of Risk | Probability of Occurrence
Cultivated parcel | R-1 | Strong overestimation of the ratio quantity of pesticide / hectare (high) | Medium
Floodplain | R-2 | Strong underestimation of the quantity of pesticides that might be present in water (high) | Medium
Example: Evaluation of potential risks of inappropriate use of geospatial data (extract from Grira, 2014)
Feature / Attribute | Identified Risks | Impact of Risk | Probability of Occurrence | Overall Risk Evaluation
Cultivated parcel | R-1 | Strong overestimation of the ratio quantity of pesticide / hectare (high) | Medium | High
Floodplain | R-2 | Strong underestimation of the quantity of pesticides that might be present in water (high) | Medium | Medium
Pesticide spread area | R-3 | Strong underestimation of the quantity of pesticides that might be present in water (high) | Medium | Medium
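The evaluation table can be held as a small risk register and sorted to decide which risks to address first. The sketch below transcribes the three rows; the overall evaluation is stored as recorded rather than computed, because the table gives different overall ratings to risks with the same impact and probability. The "Low" level, the class Risk and the helper functions are assumptions added for illustration.

```python
from dataclasses import dataclass

# Ordering used only to sort the qualitative levels ("Low" is assumed).
LEVELS = {"Low": 1, "Medium": 2, "High": 3}


@dataclass(frozen=True)
class Risk:
    risk_id: str
    feature: str
    impact: str
    probability: str
    overall: str  # recorded by the evaluator, not computed here


# Values transcribed from the evaluation table (extract from Grira, 2014).
REGISTER = [
    Risk("R-1", "Cultivated parcel", "High", "Medium", "High"),
    Risk("R-2", "Floodplain", "High", "Medium", "Medium"),
    Risk("R-3", "Pesticide spread area", "High", "Medium", "Medium"),
]


def prioritize(risks):
    """Sort risks from highest to lowest overall evaluation."""
    return sorted(risks, key=lambda r: LEVELS[r.overall], reverse=True)


def at_or_above(risks, level):
    """Keep the risks whose overall evaluation reaches `level`."""
    return [r for r in risks if LEVELS[r.overall] >= LEVELS[level]]


print([r.risk_id for r in prioritize(REGISTER)])
print([r.risk_id for r in at_or_above(REGISTER, "High")])
```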
Examples (mitigation):
• Improve database design/dataset structure
• Improve the quality control of the dataset (e.g., add integrity constraints)
• Use standards (e.g., for data quality and risk management interoperability)
• Properly inform users in a language they understand (highly recommended)
• Provide a user manual
• Offer a 1-800 help line or a help e-mail address
• List target usages and non-recommended usages
• Provide a Guide of good practices
• Train users, …
• Conduct tests on the dataset (users)
• Compare with another dataset (users)
• (…)
Examples (avoidance):
• Stop distributing or using the dataset or a part of it
• Eliminate a category of users
• Eliminate a data provider
• Explicitly and clearly forbid a given usage
• (…)
Examples (transfer/sharing):
• Buy insurance
• Obtain the dataset from a broker who can give advice related to its use
• Use a dataset with a guarantee that clearly explains risk sharing (who is responsible for what)
• The content of a guarantee for geospatial products is described in a paper by Plante and
Gervais to be published in Geomatica in 2015.
• Have the dataset evaluated by an expert
• Replace a B2C strategy with a B2B strategy for your business by contracting a data broker who
will offer the B2C strategy
• Have the data quality evaluated by an external expert for the new usages
• (…)
Example (acceptance):
• Use the dataset no matter what the risks are and do nothing about it, i.e. take
the risk
Examples:
Quality metadata (the
CanVec+ 082C
metadata (extract))
Recommendations
Section Overview
Conclusions
As geospatial data is increasingly being produced and (re)used by
new types of actors, the question of geospatial data quality is
becoming a major concern.
The objective of this guide was to support the Canadian geospatial
community in its efforts to make the spatially-enabled society more
aware of geospatial data quality.
Based on international standards such as ISO 19157 (Geospatial data
quality) and ISO 31000 (Risk management), this guide presented:
• The concepts underlying geospatial data quality
• The management of geospatial data quality
• The geospatial data quality evaluation process in detail (based on
ISO 19157)
• The management of risks of inappropriate use of geospatial data
(based on ISO 31000)
• Detailed examples of quality evaluation and risk management
tasks to be undertaken in the B2B, B2C and C2C contexts
Question and Answer Session
How to find the CGDI
Resource Centre
http://www.nrcan.gc.ca/earth-sciences/geomatics/canadas-spatial-data-infrastructure/8906
Thank you!
Mr. Eric Wright
Geomatics Engineer, CCEO/GeoConnections
QUESTIONS?