Chapter 4
GIS and Remote Sensing
Bahir Dar University
Tekeste D.
Data Quality
GiGo: garbage in, garbage out
‘Cos it’s in the computer, don’t mean it’s right
It’s not the things you don’t know that matter, it’s the
things you know that aren’t so.
Will Rogers, Famous Okie GI specialist
“But there are also unknown unknowns: the ones we
don't know we don't know.” Donald Rumsfeld
Horwood’s Short Laws on Data
Dr. Edgar Horwood, founder of the Urban and Regional Information Systems Association (URISA) and
Professor of Civil Engineering and Urban Planning at the University of Washington was an early
pioneer of computer mapping in the early 1960s.
Data Quality: How good is your data?
• Scale
– ratio of distance on a map to the equivalent distance on the earth's surface
– Primarily an output issue; at what scale do I wish to display?
• Precision or Resolution
– the exactness of measurement or description
– Determined by input; can output at lower (but not higher) resolution
• Accuracy
– the degree of correspondence between data and the real world
– Fundamentally controlled by the quality of the input
• Lineage
– The original sources for the data and the processing steps it has undergone
• Currency
– the degree to which data represents the world at the present moment in time
• Documentation or Metadata
– data about data: recording all of the above
• Standards
– Common or “agreed-to” ways of doing things
– Data built to standards is more valuable since it’s more easily shareable
Scale
– ratio of distance on a map, to the equivalent distance on the earth's surface.
• Large scale -->large detail, small area covered (1”=200’ or 1:2,400)
• Small scale -->small detail, large area (1:250,000)
• A given object (e.g. land parcel) appears larger on a large scale map
– scale can never be constant everywhere on a map ‘cos of map projection
• problem is worst for small scale maps & certain projections (e.g. Mercator)
• can be true from a single point to everywhere
• can be true along a line , or a set of lines
• on large scale maps, adjustments are often made to achieve ‘close to true’ scale
everywhere (e.g. State Plane and UTM systems)
– scale representation
• Verbal (good for interpretation): one inch equals one statute mile
• Representative fraction (RF) (good for measurement): 1:63,360
(smaller fraction = smaller scale: 1:2,000,000 is smaller than 1:2,000)
• Scale bar (good if enlarged/reduced): 0 1 2 Miles
• use them all on a map!
Scale Examples
Common Scales
1:200 (1”=16.7ft)
1:2,000 (1”=56 yards; 1cm=20m)
1:20,000 (5cm=1km)
1:24,000 (1”=2,000ft)
1:25,000 (4cm=1km)
1:50,000 (2cm=1km)
1:62,500 (1.6cm=1km; 1”=.986mi)
1:63,360 (1”=1mile; 1cm=.634km)
1:100,000 (1”=1.58mi; 1cm=1km)
1:500,000 (1”=7.9mi; 1cm=5km)
1:1,000,000 (1”=15.8mi; 1cm=10km)
1:7,500,000 (1”=118mi; 1cm=75km)
Large versus Small
large: above 1:12,500
medium: 1:13,000 - 1:126,720
small: 1:130,000 - 1:1,000,000
very small: below 1:1,000,000
(really, relative to what’s available for a given area; Maling 1989)
Map sheet examples:
1:24,000: 7.5 minute USGS Quads (17 by 22 inches; 6 by 8 miles)
1:7,500,000: US wall map (26 by 16 inches)
1:20,000,000: US 8.5” X 11”
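The verbal equivalents in a scale list like this one follow directly from the representative fraction. The sketch below is only an illustration (the function names are made up for this example): it divides the scale denominator by the number of inches in a statute mile, or centimeters in a kilometer.

```python
# Convert a representative fraction 1:N into verbal scale statements.
# Function names are illustrative, not from any GIS package.
INCHES_PER_MILE = 63_360   # 5,280 ft x 12 in
CM_PER_KM = 100_000


def miles_per_inch(denominator):
    """Ground miles represented by one map inch at scale 1:denominator."""
    return denominator / INCHES_PER_MILE


def km_per_cm(denominator):
    """Ground kilometers represented by one map centimeter at 1:denominator."""
    return denominator / CM_PER_KM


for d in (24_000, 63_360, 100_000, 7_500_000):
    print(f"1:{d:,}  1 in = {miles_per_inch(d):.2f} mi  1 cm = {km_per_cm(d):.2f} km")
```

Running a check like this over a table of scales is a quick way to catch slipped decimal points in the verbal equivalents.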
Scale, Resolution & Accuracy in GIS Systems
• On paper maps, scale is hard to change, thus it generally determines resolution
and accuracy--and consistent decisions are made for these.
• A GIS is scale independent since output can be produced at any scale,
irrespective of the characteristics of the input data— at least in theory
• in practice, an implicit range of scales or maximum scale for anticipated output
should be chosen and used to determine:
– what features to show
• manholes only on large scale maps
– how features will be represented
• manhole a polygon at 1:50; cities a point at 1:1,000,000
– appropriate levels for accuracy and precision
• Larger scale generally requires greater resolution
• Larger scale necessitates a higher level of accuracy
• GIS also helps with the generalization problem implicit in paper maps
– A road drawn with 0.5 mm wide line (the smallest for decent visibility)
• At 1:24,000 implies the road is 12 meters (about 39 feet) wide
• At 1:250,000 implies the road is 125 meters (about 410 feet) wide
– At least in a GIS you can store the true road width, but be careful with plots!
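The road-width arithmetic above is just the drawn line width multiplied by the scale denominator. A minimal sketch, with a hypothetical function name, assuming the line width is given in millimeters:

```python
# Ground width implied by a drawn line: line width x scale denominator.
# The function name is illustrative only.
def ground_width_m(line_width_mm, scale_denominator):
    """Return the ground width in meters implied by a line drawn on a map."""
    return line_width_mm * scale_denominator / 1000.0  # mm -> m


for d in (24_000, 250_000):
    print(f"0.5 mm line at 1:{d:,} implies {ground_width_m(0.5, d):g} m on the ground")
```

The same function shows why plots need care: a stored road width of 12 m only matches a 0.5 mm line at 1:24,000, and at any smaller scale the plotted line exaggerates the road.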
Precision or Resolution
it’s not the same as scale or accuracy!
Precision: the exactness of measurement or description
• the “size” of the “smallest” feature which can be displayed, recognized, or described
• Can apply to space, time (e.g. daily versus annual), or attribute (e.g. Douglas fir vs. conifer)
• for raster data, it is the size of the pixel (resolution)
– e.g. for NTGISC digital orthos it is 1.6 ft (half meter)
• raster data can be resampled by combining adjacent cells;
this decreases resolution but saves storage
– e.g. 1.6 ft to 3.2 ft (1/4 storage); to 6.4 ft (1/16 storage)
• resolution and scale
– generally, increasing to larger scale allows features to be observed better and requires
higher resolution
– but, because of the human eye’s ability to recognize patterns, features in a lower resolution
data set can sometimes be observed better by decreasing the scale
(6.4 ft resolution shown at 1:400 rather than 1:200)
• resolution and positional accuracy
– you can see a feature (resolution), but it may not be in the right place (accuracy)
– higher accuracy generally costs much more to obtain than higher resolution
– accuracy cannot be greater (but may be much less) than resolution (e.g. if pixel size is one
meter, then best accuracy possible is one meter)
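Resampling by combining adjacent cells, and the resulting storage saving, can be sketched with a plain list-of-lists raster. This is an illustration under simple assumptions (block averaging, grid dimensions trimmed to a multiple of the factor); the function names are hypothetical, and real GIS packages offer several resampling methods.

```python
# Resample a raster to coarser resolution by averaging factor x factor blocks.
# Pure-Python sketch; names are illustrative only.
def aggregate(grid, factor):
    """Combine adjacent cells into blocks, keeping the mean of each block."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def cell_count(grid):
    """Number of cells stored in the raster."""
    return sum(len(row) for row in grid)


# A 4 x 4 raster (think 1.6 ft pixels) resampled to 2 x 2 (3.2 ft pixels).
fine = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
coarse = aggregate(fine, 2)
print(coarse)
print("storage ratio:", cell_count(coarse) / cell_count(fine))
```

Doubling the pixel size cuts the cell count to one quarter, and doubling again to one sixteenth, which is where the storage fractions quoted above come from.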
Accuracy: rests on at least four legs, not one!
Positional Accuracy (sometimes called Quantitative accuracy)
Spatial
– horizontal accuracy: distance from true location
– vertical accuracy: difference from true height
Temporal
– Difference from actual time and/or date
Attribute Accuracy or Consistency-- the validity concept in experimental design/statistical inference
– a feature is what the GIS/map purports it to be
– a railroad is a railroad, and not a road
– A soil sample agrees with the type mapped
Completeness--the reliability concept from experimental design/statistical inference
– Are all instances of a feature the GIS/map claims to include, in fact, there?
– Partially a function of the criteria for including features: when does a road become a track?
– Simply put, how much data is missing?
Logical Consistency: The absence of contradictory relationships in the database. Examples of inconsistency:
– Non-Spatial
• Some crimes recorded at place of occurrence, others at place where report taken
• Data for one country is for 2000, for another it’s for 2001
• Annual data series not taken on same day/month etc. (sometimes called lineage error)
• Data uses different source or estimation technique for different years (again, lineage)
– Spatial
• Overshoots and gaps in road networks or parcel polygons
Sources of Error
Error is the inverse of accuracy. It is a discrepancy
between the coded and actual values.
Sources
• Inherent instability of the phenomenon itself
– E.g. Random variation of most phenomena (e.g. leaf size)
• Measurement
– E.g. surveyor or instrument error
• Model used to represent data
– E.g. choice of spheroid, or classification systems
• Data encoding and entry
– E.g. keying or digitizing errors
• Data processing
– E.g. single versus double precision; algorithms used
• Propagation or cascading from one data set to another
– E.g. using inaccurate layer as source for another layer
Example for Positional Accuracy
• choice of spheroid and datum
• choice of map projection and its parameters
• accuracy of measured locations (surveying) of features on earth
• media stability (stretching, folding, wrinkling of maps, photos)
• human drafting, digitizing or interpretation error
• resolution &/or accuracy of drafting/digitizing equipment
– Thinnest visible line: 0.1-0.2 millimeters
– At scale of 1:20,000 = 6.6 - 13.1 feet
(20,000 x 0.2 = 4,000mm = 4m = 13.1 feet)
• registration accuracy of tics
• machine precision: coordinate rounding error in storage and manipulation
• other unknown
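The "single versus double precision" and "coordinate rounding" items can be made concrete with a small sketch. It assumes a hypothetical projected northing of a few million meters, and uses the standard library `struct` module to round-trip the value through a 4-byte float.

```python
# How much positional error does single-precision storage introduce?
# The coordinate value is a made-up example, not taken from any data set.
import struct


def as_float32(value):
    """Round-trip a Python float (double precision) through a 4-byte float."""
    return struct.unpack("f", struct.pack("f", value))[0]


northing = 4_345_678.123           # hypothetical northing in meters
stored = as_float32(northing)      # what a single-precision field would hold
error = abs(stored - northing)

print(f"original: {northing:.3f} m")
print(f"stored:   {stored:.3f} m")
print(f"rounding error: {error:.3f} m")
```

At this magnitude a single-precision value can only step in half-meter increments, so the millimeter digits are lost no matter how carefully the point was surveyed.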
Measurement of Positional Accuracy
• usually measured by root mean square error (RMSE): the square root of the average
of the squared errors
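A minimal sketch of the root mean square error calculation, assuming each error is the horizontal distance between a mapped point and its surveyed "true" position. The coordinates below are invented for illustration, and the function names are hypothetical.

```python
# Root mean square error of positional errors.
# Sample coordinates are made up for illustration.
import math


def horizontal_errors(mapped, surveyed):
    """Distance between each mapped point and its surveyed position."""
    return [math.hypot(mx - sx, my - sy)
            for (mx, my), (sx, sy) in zip(mapped, surveyed)]


def rmse(errors):
    """Square root of the average of the squared errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


mapped = [(103.0, 204.0), (500.0, 500.0), (806.0, 908.0)]
surveyed = [(100.0, 200.0), (500.0, 500.0), (800.0, 900.0)]

errors = horizontal_errors(mapped, surveyed)
print("errors:", errors)
print(f"RMSE: {rmse(errors):.2f}")
```

Because the errors are squared before averaging, one badly placed point raises the RMSE more than several slightly misplaced ones.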
[Figure: position according to chart versus position in reality]
Summary:
Resolution, Scale, Accuracy & Storage:
illustrating the relationship
Go to quality_graphics.ppt
Lineage
• identifies the original sources from which the data was
derived
• details the processing steps through which the data has
gone to reach its current form
• Both impact its accuracy
• Both should be in the metadata, and are required by the
Content Standard for Metadata (see below)
• Michael Goodchild (the guru of GIS) advocates:
– Measurement-based GIS, in which how the data were collected and how
measurements were made are part of the record (as in surveying)
– Coordinate-based GIS is the current approach, and it tracks
none of this.
(see Shi, Fisher and Goodchild, Spatial Data Quality, London: Taylor & Francis, 2002)
Currency: Is my data “up-to-date”?
• data is always relative to a specific point in time, which must be
documented.
– there are important applications for historical data (e.g. analyzing trends),
so don’t necessarily trash old data
• “current” data requires a specific plan for on-going maintenance
– may be continuous, or at pre-defined points in time.
– otherwise, data becomes outdated very quickly
• currency is not really an independent quality dimension; it is
simply a factor contributing to lack of accuracy regarding
– consistency: some GIS features do not match those in the real world today
– completeness: some real world features are missing from the GIS database
Standards
• May address:
– Content (what is recorded)
– Format (how it’s recorded: file format, .tif, shapefile, etc)
• May be a product of:
– An organization’s internal actions [private or organization standards]
– An external government body (Federal Geographic Data Committee) or third
sector body (Open GIS Consortium) [public or de jure standards]
– Laissez-faire market-place-forces leading to one dominant approach e.g. “Wintel
standard” [industry or de facto standards]
http://www.fgdc.gov/standards/standards.html
Who Sets Public Standards?
• Federal Geographic Data Committee
– Sets standards for geospatial data which all federal agencies are
required to follow
– Has representatives from most federal agencies
– National Institute of Standards and Technology (NIST) sets federal
gov. standards for other things (e.g. IT in general)
• national standards bodies
– American National Standards Institute (ANSI)
• has the US’s single vote at ISO
– InterNational Committee for Information Technology
Standards (INCITS) handles IT standards for ANSI
– Several FGDC standards have been submitted for approval
– Most countries in the world have their equivalent to ANSI
• international standards bodies
– ISO (International Organization for Standardization)
• other assorted vendor groups, professional associations, trade
associations, and consortia
– Open GIS Consortium (OGC) is the main player in GIS
The Process for Setting de jure standards!
Source: URISA News
Issue 197, Sept/Oct. 2003
Go to the following web site for an excellent overview of the standards-making process:
http://www.fgdc.gov/publications/documents/standards/geospatial_standards_part1.html
Adopting Standards: What you should do
• Data quality achieved by adoption and use of standards: Do it!
– Common ways of doing things essential for using & sharing data internally
and externally
• only federal agencies are required to use FGDC standards; it’s
optional for any others (e.g. state, local)
– power of feds often results in adoption by everybody, although there are
some noted failures (e.g. the OSI, GOSIP, & POSIX standards in computing
in the 1980s failed and were withdrawn)
• FGDC or ISO standards provide excellent starting point for local
standards, and should be adopted unless there are compelling
reasons otherwise
• Standards for metadata (“documenting your data”) are the most
important and should be first priority.
– Content Standard for Digital Geospatial Metadata (version 2.0), FGDC-STD-001-1998
– ISO Document 19115 Geographic Information-Metadata (content) and 19139, Geographic
Information—Metadata—Implementation Specification, (format for storing ISO 19115
metadata in XML format)
– If not one of these standards for metadata, adopt some standard!
Content Standards for Digital Geospatial Metadata
What and Why?
Main Sections of the US Federal
Content Standard for Digital Geospatial Metadata
Identification
Title? Area covered? Themes? Currency? Restrictions?
Data Quality (5 aspects)
Positional & Attribute Accuracy? Completeness? Logical Consistency? Lineage?
Spatial Data Organization
Indirect? Vector? Raster? Type of elements? Number?
Spatial Reference
Projection? Grid system? Datum? Coordinate system?
Entity and Attribute Information
Features? Attributes? Attribute values?
Distribution
Distributor? Formats? Media? Online? Price?
Metadata Reference
Metadata currency? Responsible party?
For more info, go to: http://www.fgdc.gov/metadata/contstan.html
By law (Executive Order 12906, 1994), all federal agencies must document their data according to:
Content Standard for Digital Geospatial Metadata (version 2.0), FGDC-STD-001-1998
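One simple way to use the section list above is as a checklist. The sketch below is hypothetical: it treats a metadata record as a plain dictionary keyed by the seven main section names and reports which are empty or absent. It does not model which sections the standard makes mandatory, nor the official element names or file formats.

```python
# Checklist sketch: which main metadata sections has a data set documented?
# The record layout and sample values are hypothetical.
MAIN_SECTIONS = (
    "Identification",
    "Data Quality",
    "Spatial Data Organization",
    "Spatial Reference",
    "Entity and Attribute Information",
    "Distribution",
    "Metadata Reference",
)


def missing_sections(record):
    """Return the main sections that are absent or empty in the record."""
    return [name for name in MAIN_SECTIONS if not record.get(name)]


record = {
    "Identification": {"title": "Road centerlines", "currency": "2004"},
    "Data Quality": {"lineage": "Digitized from 1:24,000 quads"},
    "Spatial Reference": {"datum": "NAD27", "projection": "UTM"},
    "Distribution": {},  # present but empty, so still reported as missing
}

print("Missing sections:", missing_sections(record))
```

A check like this only confirms that something was written under each heading; whether the content is adequate still needs a human reviewer.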
Traditional Minimum Documentation Requirements for Maps/GIS
• geodetic datum name (e.g. NAD27)--which implies:
– ellipsoid/spheroid name (earth model) e.g. Clarke 1866
– point of origin (ties ellipsoid to earth) e.g. Meades Ranch
– required for all GIS data bases and maps
• projection name and its parameters and its measurement units
(see terrestrial lecture for exact details)
– Required for all maps since 2-D by nature
– Required for GIS if data is in X-Y projected form
• Source information
– accuracy standard(s) to which built
– author/publisher/creator name and/or data source
– date(s) of data collection/update, and of map/GIS creation
• Cartographers demand all maps have:
– north arrow
– map scale
– graticule indication or other map tic marks (+): points of positional reference used to
relate map to ground
• at least four latitude/longitude tic marks, with values in degrees
• at least four X-Y tic marks, with values and units of measurement (feet, meters, etc.)
Texas Standards
http://www.dir.state.tx.us/tgic/pubs/pubs.htm
• Standards for digital spatial data (raster and vector) for State
agencies in Texas were established in 1992
– http://www.dir.state.tx.us/tgic/pubs/gis-standards.htm
– Currently (2004), being reviewed by the Texas Geographic Information
Council (TGIC) for possible update
– Apply to map scales of 1:24,000 and smaller (e.g., 1:100,000; 1:250,000).
– Cover a variety of issues including data layers, datum, projections, accuracy,
metadata, etc.
• Two major planning reports on GIS in state gov. in Texas are:
– Digital Texas: 2002 Biennial Report on Geographic Information
Systems Technology
• http://www.dir.state.tx.us/tgic/pubs/digtex-lowres.pdf
– Geographic Information Framework for Texas (1999)
• http://www.dir.state.tx.us/tgic/pubs/gift99-small.pdf
Importance of Standards
Great Baltimore Fire of 1904 - fire engines from other
regions responded, only to prove useless because their hose
couplings were different sizes and did not fit Baltimore
hydrants. The fire burned for over 30 hours and destroyed
1,526 buildings covering 17 city blocks.
Fire of 1923 - Fall River, MA was saved when over 20 neighboring
fire departments responded to a town fire, since they had
standardized on hydrant and hose coupling sizes.
9/11: Response in NY and DC was severely hampered by
incompatibilities between GIS data sets, and by lack of data.
Also, incompatibilities between communications systems.
The most important standard?
Railroad track gauge - adopted by US, UK, Canada, and much of
Europe.
South America still hampered by differing railroad gauges between
countries.
The Best Time to Adopt a Standard?
Now? Now?
Before!
Appendix
FGDC Standards
(status as of March 2004)
For latest, go to:
http://www.fgdc.gov/standards/standards.html
FGDC: Metadata Standards
Metadata:
• Content Standard for Digital Geospatial Metadata (version 2.0)
FGDC-STD-001-1998
• Content Standard for Digital Geospatial Metadata, Part 1:
Biological Data Profile FGDC-STD-001.1-1999
• Metadata Profile for Shoreline Data (FGDC-STD-001.2-2001)
• Content Standard for Digital Geospatial Metadata: extension for
remote sensing data (FGDC-STD-012-2002)
• Encoding Standard for Geospatial Metadata (Draft)
• Metadata Profile for Cultural and Demographic Data (dropped)
FGDC: Framework Data Standards
• establish data content requirements for the seven layers of geospatial data that
comprise the National Spatial Data Infrastructure (NSDI), the base layers
needed for any geographic area
• Geodetic control
• Elevation
• Orthoimagery
• Hydrography (water)
• Transportation
• Cadastral (landownership)
• Governmental unit boundaries
• Goals are to
– Facilitate and promote exchange of framework layers between producers,
consumers, and vendors thru a common content and way of describing that content
– Lower the cost of data for everyone
• For each layer, specifies an integrated application schema in Unified Modeling
Language (UML) including feature types, attribute types, attribute domain,
feature relationships, spatial representation, data organization, and metadata
• no standard specified for data format, but an appendix describes a possible
implementation using the Geography Markup Language (GML) Version 3.0,
developed through the Open GIS Consortium, Inc. (OGC).
FGDC: Data Transfer Standards
Spatial Data Transfer Standard (SDTS) FGDC-STD-002
SDTS, Part 1: Logical Specification (FIPS PUB 173-1, July 1994)
SDTS, Part 2: Spatial Features (FIPS PUB 173-1, July 1994)
SDTS, Part 3: ISO 8211 Encoding (FIPS PUB 173-1, July 1994)
SDTS, Part 4: Topological Vector Encoding (FIPS PUB 173-1, July 1994)
SDTS, Part 5: Raster Profile and Extensions (FGDC-STD-002.5, 2000)
SDTS, Part 6: Point Profile (FGDC-STD-002.6, 2000)
SDTS, Part 7: Computer-Aided Design and Drafting (CADD) Profile
(FGDC-STD-002.7, 2000)
THANKS!
Any questions?