Lecture 9 - Data Types and Errors

The document discusses the importance of acquiring geographic data in cartography, noting that data acquisition can consume a significant portion of project resources. It highlights various sources of errors in geographic data, including inaccuracies in measurements and outdated information, and emphasizes the necessity of understanding these errors to ensure the quality of spatial analysis. Additionally, it outlines different types of attribute data and measurement scales, explaining their relevance in GIS analysis.

Uploaded by

HAMO

CGB 213: Principle of Cartography - Lecture 9 – Geographic Data Errors and Attribute Data Types
Introduction
• Acquiring geographic data is crucial in any cartographic effort.
• It has been estimated that data acquisition typically consumes 60 to 80 percent of the time and money spent on any given project.
• It is important to be aware that geographic data may carry errors, and users should take note of these.
Geographic Data Errors
• Cartographic data is not perfect. Like any other data, it can contain errors or inaccuracies that may affect the final output.
• Some common sources of error in cartographic data include incomplete or outdated data sources, errors in data entry or conversion, imprecise or inaccurate measurements, and inherent limitations of the data collection method.
Geographic Data Errors
• The power of GIS and mapping lies in the ability to integrate many types of data related to the same geographical area within a single system and to use them together in analysis.
• When a new dataset is loaded into a GIS software application, the software imports not only the data but also the errors that the data contains. The first step in dealing with error is being aware of it and understanding the limitations of the data being used.
Geographic Data Errors
• Those who work with cartographic data should understand that error, inaccuracy, and imprecision can affect the quality of many types of cartographic projects: errors that are not accounted for can turn a project's analysis into a pointless exercise.
• Understanding the errors inherent in cartographic data is critical to ensuring that any spatial analysis performed using those datasets meets a minimum threshold for accuracy.
• The saying “Garbage in, garbage out” applies all too well when data that is inaccurate, imprecise, or full of errors is used during analysis.
Accuracy and Precision
• Accuracy and precision are both important aspects of cartographic data quality, but they refer to different things.
• To understand the relevance of accuracy and precision, we should start by distinguishing the two terms:
Accuracy
• Accuracy can be defined as the degree to which the information on a map matches the values in the real world. Therefore, when we refer to accuracy, we are talking about the quality of the data and the number of errors contained in a given dataset.
• In cartographic data, accuracy usually refers to geographic position, but it can also refer to attribute or conceptual accuracy.
Precision
• Precision refers to the level of measurement and exactness of description in a GIS database.
• Precise data may still be inaccurate, because it may be exactly described but inaccurately gathered (for example, the surveyor made a mistake, or the data was recorded wrongly in the database).
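The distinction between the two terms can be illustrated with a minimal Python sketch. The benchmark elevation and both recorded values are hypothetical, and counting stored decimal places is used here only as a simple stand-in for "exactness of description".

```python
# Hypothetical real-world elevation of a benchmark, in metres.
TRUE_ELEVATION_M = 1012.0

# Two hypothetical database records for the same benchmark.
coarse_record = 1012.4        # stored to 1 decimal place
detailed_record = 1019.48213  # stored to 5 decimal places


def decimal_places(value: float) -> int:
    """Count the digits stored after the decimal point (a proxy for precision)."""
    text = repr(value)
    return len(text.split(".")[1]) if "." in text else 0


def absolute_error(value: float, truth: float) -> float:
    """Distance between the recorded value and the real-world value (accuracy)."""
    return abs(value - truth)


for name, value in [("coarse", coarse_record), ("detailed", detailed_record)]:
    places = decimal_places(value)
    error = absolute_error(value, TRUE_ELEVATION_M)
    print(f"{name}: {places} decimal places, error {error:.2f} m")
```

In this sketch the more precisely described record is also the less accurate one, which is the situation the second bullet warns about.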
Sources of Inaccuracy and Imprecision
• Scale, for example, is an inherent source of error in cartography; depending on the scale used, different types of data can be represented in different quantity and quality. Cartographers should adapt the scale of work to the level of detail needed in their projects.
• The age of data is another obvious source of error. When data sources are too old, some, or even most, of the underlying information may have changed. Cartographers should always check how current a dataset is before using it for analysis.
Attribute Errors
• A common mistake is a label or attribute error.
• For instance, agricultural land may be incorrectly marked as forest, and the map user may not notice the error because they are not familiar with the area in question.
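One way to detect label errors of this kind is to compare mapped labels with labels observed in a field check. The sketch below assumes such a field check exists. The label lists and both function names are hypothetical and are not part of any GIS package.

```python
# Hypothetical land-cover labels: what the map says vs. what a field check found.
mapped_labels = ["Forest", "Forest", "Agriculture", "Urban", "Forest"]
field_labels = ["Forest", "Agriculture", "Agriculture", "Urban", "Forest"]


def attribute_accuracy(mapped, observed):
    """Return the share of features whose mapped label matches the field check."""
    if not mapped or len(mapped) != len(observed):
        raise ValueError("Both lists must be non-empty and describe the same features")
    matches = sum(m == o for m, o in zip(mapped, observed))
    return matches / len(mapped)


def mislabelled(mapped, observed):
    """List (index, mapped label, observed label) for every disagreement."""
    return [(i, m, o) for i, (m, o) in enumerate(zip(mapped, observed)) if m != o]


print(attribute_accuracy(mapped_labels, field_labels))
print(mislabelled(mapped_labels, field_labels))
```

Here the second feature is agricultural land mapped as forest, so four of the five labels agree.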
Positional accuracy of GIS data
• Cartographers can accurately locate certain features such as roads and boundary lines, but other data with less well-defined positions in space, such as soil types, may have only an approximate location based on the cartographer's estimate.
• Other features, such as climate, lack defined boundaries in nature and are therefore subject to interpretation by the data producer or cartographer.
Topological errors
• Topological errors occur often during the digitizing process.
Attribute Data Types
• The type of data that we employ to help us understand a given entity is determined by (1) what we are examining, (2) what we want to know about that entity, and (3) our ability to measure that entity at the desired scale.
• The most common data types available in a GIS are alphanumeric strings, numbers, Boolean values, dates, and binaries.
Attribute Data Types
• An alphanumeric string, or text, data type is any simple combination of letters and numbers that may or may not form coherent words.
• The character (or string) property is for text-based values such as the name of a street, or descriptive values such as the condition of a street.
Attribute Data Types
• The number data type can be subcategorized as either floating-point or integer.
• A floating-point number is any data value with a fractional part (digits after the decimal point), while an integer is a whole number with no fractional part.
• Integers can be short or long, depending on the range of values they must hold.
Attribute Data Types
• Boolean, date, and binary values are less complex.
• Boolean values are simply those deemed true or false, often as the result of applying a Boolean operator such as AND, OR, or NOT.
• The date data type is self-explanatory, while the binary data type represents attributes whose values are either 1 or 0.
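The data types described on these slides can be sketched as one attribute record in Python. The `RoadSegment` class, its field names, and all values are hypothetical. Real GIS software defines its own field types, so this only illustrates the categories.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RoadSegment:
    """Hypothetical attribute record for one road feature."""
    name: str          # alphanumeric string (text)
    lane_count: int    # integer: no fractional part
    length_km: float   # floating point: has a fractional part
    is_paved: bool     # Boolean: True or False
    surveyed_on: date  # date
    one_way_flag: int  # binary as described above: 1 or 0


segment = RoadSegment("Main Street", 2, 12.75, True, date(2021, 5, 14), 0)

# A Boolean value produced by combining conditions with AND and NOT.
needs_review = segment.is_paved and not (segment.length_km < 1.0)
print(needs_review)
```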
Measurement Scale
• In addition to defining data by type, a measurement scale groups data according to its level of complexity.
• For GIS analysis, measurement scales can be grouped into two broad categories: nominal and ordinal data represent categorical data; interval and ratio data represent numeric data.
Nominal Scale
• The most straightforward data measurement scale is the nominal or named scale.
• The nominal scale makes statements about what to call data points but does not allow for scalar comparisons between one object and another.
• For example, attributing nominal information to points representing cities will describe whether the given city is “Gaborone” or “Francistown.” However, the name conveys nothing further about those locales, such as population or voting history.
• Other examples of nominal data include last name, eye color, land-use type, ethnicity, and gender.
Ordinal Scale
• Ordinal data places attribute information into ranks and yields more precisely scaled information than nominal data.
• Ordinal data describes the position in which data occur, such as first, second, or third.
Ordinal Scale
• These scales may also take on names such as “very unsatisfied,” “unsatisfied,” “satisfied,” and “very satisfied.” Although this measurement scale indicates the ranking of each data point relative to other data points, the ordinal scale does not explicitly denote the exact quantitative difference between these rankings.
• For example, if an ordinal attribute represents which runner came in first, second, or third place, it does not state by how much time the winning runner beat the second-place runner. Therefore, one cannot undertake arithmetic operations with ordinal data. Only the sequence is explicit.
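The operations that remain valid on ordinal data (sorting, comparing ranks, taking a median) can be sketched in Python. The list of responses is hypothetical. The integer returned by `rank` is only a position in the sequence, so subtracting two ranks would not measure a real difference.

```python
# Ordered categories from the satisfaction example, lowest to highest.
SATISFACTION_LEVELS = ["very unsatisfied", "unsatisfied", "satisfied", "very satisfied"]


def rank(level: str) -> int:
    """Return the position of a level in the ordered scale."""
    return SATISFACTION_LEVELS.index(level)


# Hypothetical survey responses.
responses = ["satisfied", "very unsatisfied", "very satisfied", "satisfied", "unsatisfied"]

# Sorting and comparing are valid because the sequence is explicit.
ordered = sorted(responses, key=rank)
median_response = ordered[len(ordered) // 2]
is_higher = rank("satisfied") > rank("unsatisfied")

print(ordered)
print(median_response, is_higher)
```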
Interval Scale
• An interval data measurement scale allows precise quantitative statements about attributes.
• Interval data are measured along a scale in which each position is equidistant.
• Elevation and temperature readings are typical representations of interval data.
• For example, this scale can determine that 30 degrees Celsius is 5 degrees Celsius warmer than 25 degrees Celsius.
Interval Scale
• A notable property of the interval scale is that its zero point is arbitrary: zero does not represent nothingness or the absence of a value.
• Indeed, 0 degrees Celsius does not indicate that no temperature exists. Similarly, an elevation of 0 meters does not indicate a lack of elevation but indicates mean sea level.
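The consequence of an arbitrary zero can be illustrated with a short Python sketch: differences between interval values are meaningful, but ratios are not. The Kelvin conversion is brought in here only as an illustration of a temperature scale with a true zero. It is not part of the lecture.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert to Kelvin, a temperature scale whose zero is absolute zero."""
    return celsius + 273.15


warm_c, cool_c = 30.0, 25.0

# Differences are meaningful on an interval scale.
difference_c = warm_c - cool_c

# Ratios are not: the Celsius ratio suggests "20% warmer", but the same
# two temperatures on a scale with a true zero give a very different ratio.
naive_ratio = warm_c / cool_c
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(difference_c)
print(round(naive_ratio, 3), round(kelvin_ratio, 3))
```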
Ratio Scale
• Ratio data are like interval data but are based on a meaningful zero value.
• Population density is an example of ratio data: a population density of 0 indicates that no people live in an area.
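Because ratio data have a meaningful zero, statements such as "four times as dense" are valid. The following Python sketch uses hypothetical district populations and areas.

```python
def population_density(population: int, area_km2: float) -> float:
    """People per square kilometre; zero means nobody lives in the area."""
    if area_km2 <= 0:
        raise ValueError("Area must be positive")
    return population / area_km2


# Hypothetical districts of equal area.
district_a = population_density(12000, 40.0)
district_b = population_density(3000, 40.0)
uninhabited = population_density(0, 40.0)

# Ratios are meaningful on a ratio scale.
times_denser = district_a / district_b

print(district_a, district_b, uninhabited, times_denser)
```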
Discrete and Continuous Data
• Specific to numeric datasets, data values can also be discrete or continuous.
• Discrete data can take only a finite number of values, while continuous data can take an infinite number of values.
• Continuous data represents a measurement that can take on any value within a range, while discrete data represents a specific category or class.
Discrete and Continuous Data
• For example, temperature is a continuous variable because it can take on any value within a range, while land use is a discrete variable because it is made up of distinct, separate categories such as forest, agriculture, or urban areas.
• Continuous data is often represented using a continuous color scale or a gradient, which allows for the visualization of patterns and trends across a range of values.
• Discrete data, on the other hand, is typically represented using a set of distinct colors or symbols that correspond to each category or class.
Discrete Data
• Discrete data is geographic data that only occurs in specific locations.
• Discrete GIS data can be represented using both vector and raster data models. Examples of discrete data include land use categories, soil types, or vegetation classes.
• Maps made with discrete GIS data will have areas on the map that contain values from that dataset and areas on the map where that dataset is absent.
Continuous Data
• Continuous data has no clearly defined boundaries.
• Every point on a map made with continuous GIS data will contain a numeric value.
• Continuous GIS data is represented by a continuous scale, such as a gradient.
• Examples of continuous data include temperature, elevation, slope, and rainfall.
Continuous Data
• In GIS, continuous data is often represented by a raster data model, where data is stored in a grid of cells, with each cell representing a small area of the Earth’s surface.
• The values in each cell represent the value of the continuous variable being measured at that location.
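The raster idea can be sketched with a plain Python grid. Everything in this sketch is hypothetical: the elevation values, the origin coordinates, the cell size, and the `cell_value` helper. Real rasters are normally read through a GIS library instead of being written out by hand.

```python
# Hypothetical 3 x 4 elevation raster in metres; row 0 is the northern edge.
elevation = [
    [1010.0, 1012.5, 1015.0, 1018.0],
    [1008.0, 1011.0, 1013.5, 1016.0],
    [1005.5, 1009.0, 1012.0, 1014.5],
]

# Assumed map coordinates of the raster's upper-left corner and its cell size.
ORIGIN_X, ORIGIN_Y = 500000.0, 7280000.0
CELL_SIZE = 30.0


def cell_value(x: float, y: float) -> float:
    """Return the value of the cell that contains map coordinate (x, y)."""
    col = int((x - ORIGIN_X) // CELL_SIZE)
    row = int((ORIGIN_Y - y) // CELL_SIZE)
    if not (0 <= row < len(elevation) and 0 <= col < len(elevation[0])):
        raise IndexError("Point lies outside the raster")
    return elevation[row][col]


print(cell_value(500045.0, 7279950.0))
```

Every coordinate inside the raster's extent maps to exactly one cell, which is why a map made from continuous data has a value at every point.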
Spatial Analysis Focus of Discrete and Continuous Data
• In GIS analysis, the type of data being used has implications for the types of analysis and techniques that can be applied.
• For example, continuous data is often used in terrain analysis or environmental modeling to identify areas of high or low elevation, slope, or other variables.
• Discrete data is often used in land use or demographic analysis to identify patterns or clusters of different types of features or populations.

• What is the age of the data?
• Where did it come from?
• In what medium was it originally produced?
• What is the area coverage of the data?
• To what map scale was the data digitized?
• What projection, coordinate system, and datum were used in the data?
• What was the density of observations used for its compilation?
• How accurate are positional and attribute features?
• Does the data seem logical and consistent?
• Do cartographic representations look "clean"?
• Is the data relevant to the project at hand?
• In what format is the data kept?
• How was the data checked?
• Why was the data compiled?
• What is the reliability of the provider?
