Lecture 8 - Accuracy in GIS

The document discusses the inherent inaccuracies and errors in GIS data, emphasizing the importance of understanding accuracy and precision for effective spatial analysis. It outlines various sources of errors, including data age, formatting issues, and digitizing mistakes, while highlighting the need for data quality reports and metadata to mitigate these issues. The document also details types of digitizing errors, such as dangling nodes and slivers, which can arise during the conversion of geographic data into vector format.


GIS Data: A Look at Accuracy, Precision, and Types of Errors
There is no such thing as perfect GIS data. This is a fact in any science, and cartography is no exception. However, the imperfection of data and its effects on GIS analysis were not considered in great detail until recent years. In the last decade, GIS specialists have come to accept that error, inaccuracy, and imprecision can affect the quality of many types of GIS projects, in the sense that errors that are not accounted for can turn the analysis in a GIS project into a useless exercise. Understanding the error inherent in GIS data is critical to ensuring that any spatial analysis performed using those datasets meets a minimum threshold of accuracy. The saying "garbage in, garbage out" applies all too well when data that is inaccurate, imprecise, or full of errors is used during analysis.

The power of GIS resides in its ability to use many types of data related to the same geographic area to perform an analysis, integrating different datasets within a single system. But when a new dataset is brought into the GIS, the software imports not only the data but also the error that the data contains. The first step in dealing with the problem of error is being aware of it and understanding the limitations of the data being used.

Accuracy and Precision


To really understand the relevance of accuracy and precision, we should start by understanding the difference between the two terms:

Accuracy can be defined as the degree of closeness with which the information on a map matches the values in the real world. Therefore, when we refer to accuracy, we are talking about the quality of data and about the number of errors contained in a certain dataset. In GIS data, accuracy can refer to geographic position, but it can also refer to attribute or conceptual accuracy.

Precision refers to how exactly data is described. Precise data may still be inaccurate, because it may be exactly described but inaccurately gathered (perhaps the surveyor made a mistake, or the data was recorded wrongly in the database).
PRECISION VERSUS ACCURACY
In the series of images above, the concept of precision versus accuracy is visualized. The crosshair in each image represents the true value of the entity, and the red dots represent the measured values. Image A is precise and accurate, image B is precise but not accurate, image C is accurate but imprecise, and image D is neither accurate nor precise. Understanding both accuracy and precision is important for assessing the usability of a GIS dataset. When a dataset is inaccurate but highly precise, corrective measures can be taken to adjust the dataset and make it more accurate.

Assessing error involves examining both the imprecision of data and its inaccuracies.
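As a rough numerical sketch of the distinction, consider a handful of repeated position readings compared against a known true value. The readings and the benchmark below are invented for illustration; the bias of the mean captures (in)accuracy, while the spread of the readings captures (im)precision, mirroring image B above (precise but not accurate).

from statistics import mean, stdev

# Hypothetical repeated GPS readings (metres east of a survey benchmark)
# whose true offset is 0.0 m. The cluster is tight but shifted:
# precise, not accurate.
true_value = 0.0
readings = [2.1, 2.0, 2.2, 1.9, 2.1]

bias = mean(readings) - true_value   # accuracy: closeness to the true value
spread = stdev(readings)             # precision: agreement among the readings

print(f"bias (inaccuracy):    {bias:.2f} m")
print(f"spread (imprecision): {spread:.2f} m")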

Sources of Inaccuracy and Imprecision


Some sources of error in GIS data are very obvious, whereas others are more difficult to notice. GIS software can lead users to believe that their data are more accurate and precise than they really are. Scale, for example, is an inherent source of error in cartography; depending on the scale used, we will be able to represent different types of data, in different quantities and with different quality. Cartographers should always adapt the scale of work to the level of detail needed in their projects.

The age of data may be another obvious source of error. When data sources are too old, some, or even a large part, of the information base may have changed. GIS users should always be mindful of the age of data and its lack of currency before using it for contemporary analysis.

Some types of errors are created when formatting data for processing. Changes in scale, reprojections, and conversions between raster and vector formats are all examples of possible sources of formatting errors.
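As one small, hedged illustration of a formatting check, the sketch below uses the pyproj library to reproject a coordinate from geographic WGS84 to Web Mercator and back again. The coordinate is arbitrary, and the round-trip differences are normally tiny; the same kind of check exposes larger discrepancies when datum transformations or raster resampling are involved.

from pyproj import Transformer

# Forward and inverse transformers between WGS84 (EPSG:4326) and
# Web Mercator (EPSG:3857); always_xy=True keeps (lon, lat) ordering.
fwd = Transformer.from_crs("EPSG:4326", "EPSG:3857", always_xy=True)
inv = Transformer.from_crs("EPSG:3857", "EPSG:4326", always_xy=True)

lon, lat = -1.615, 54.978          # an arbitrary test coordinate
x, y = fwd.transform(lon, lat)     # project to metres
lon2, lat2 = inv.transform(x, y)   # project back to degrees

# Any difference after the round trip is error introduced purely by the
# formatting/reprojection step, not by the original measurement.
print(abs(lon - lon2), abs(lat - lat2))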

Other sources of error may be less obvious; some originate at the moment of the initial measurements, and some are introduced by users at the moment the data is captured.

Quite often we can identify quantitative and qualitative errors. A common qualitative mistake is a labeling error. For instance, agricultural land may be incorrectly marked as a marsh, causing an error that the map user may not notice because he may not be familiar with the area in question. Quantitative errors may occur when using instruments that have not been properly calibrated, creating subsequent errors that are hard to identify in the field but that will cause your project to lose accuracy and reliability.

We also have to pay attention to what has been defined as positional accuracy, which is dependent on the type of data. Cartographers can accurately locate certain features such as roads and boundary lines, but other data with less well-defined positions in space, such as soil types, may be given only an approximate location based on the cartographer's estimation. Other features, such as climate, lack defined boundaries in nature and are therefore subject to subjective interpretation.

Topological errors often occur during the digitizing process. Operator errors may result in polygon knots and loops, and there may be errors associated with damaged source maps as well.
EXAMPLES OF TOPOLOGICAL ERRORS IN GIS. SOURCE: TONY ROTONDAS.

Errors can also be intentionally introduced into GIS data. Most commonly, generalization, which is used to reduce the amount of detail in a dataset, introduces error by removing aspects of a feature.
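The sketch below, which assumes the shapely library and uses invented coordinates, shows generalization in miniature: Douglas-Peucker simplification removes vertices that lie within a chosen tolerance of the simplified shape, and the Hausdorff distance between the original and generalized lines gives a measure of the positional error deliberately introduced.

from shapely.geometry import LineString

# A digitized river reach with small wiggles (invented coordinates).
river = LineString([(0, 0), (1, 0.2), (2, -0.1), (3, 0.3), (4, -0.2), (5, 0)])

# Douglas-Peucker generalization: vertices that deviate from the simplified
# shape by less than the tolerance are dropped, deliberately discarding detail.
generalized = river.simplify(tolerance=0.5)

print(len(river.coords), "->", len(generalized.coords))   # fewer vertices remain

# The positional error introduced by generalization, measured as the
# Hausdorff distance between the original and simplified lines.
print(river.hausdorff_distance(generalized))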

Another intentional introduction of error is the trademarking sometimes found within datasets by commercial GIS vendors. For example, a GIS data vendor may insert false streets or fake street names into a dataset.

We can never forget that inaccuracy, imprecision, and the resulting error may be compounded in a GIS project when more than one data source is employed. In these types of projects, one error leads to another, compounding the effects on the analysis and affecting the entire project. For that reason, the best way to avoid the propagation of errors is to always prepare a data quality report for data created by GIS users, even if they do not plan to share the data with others. The use of metadata (data about the data) is one of the first tools any GIS user should consult in order to learn more about the data being used and to avoid adding more error to data that will never be perfect in any case. Good metadata should always include basic information such as the age of the data, its origin, the area it covers, its scale, projection system, accuracy, and format.
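As an informal sketch only, a minimal metadata record covering those basics might look like the following. The field names and values are invented for illustration; real projects would normally follow a formal metadata standard such as ISO 19115 or the FGDC CSDGM.

# Field names and values below are illustrative, not a formal standard.
metadata = {
    "title": "County road centrelines",             # hypothetical dataset
    "date_of_data": "2019-06-30",                   # age of the data
    "origin": "Digitized from 1:10,000 orthophotos",
    "coverage": "Example County, bounding box in EPSG:4326",
    "scale": "1:10,000",
    "projection": "EPSG:32617 (UTM zone 17N)",
    "positional_accuracy_m": 5.0,                   # stated horizontal accuracy
    "format": "Esri Shapefile",
}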

Digitizing Errors in GIS


Digitizing in GIS is the process of converting geographic data, either from a hardcopy map or a scanned image, into vector data by tracing the features. During the digitizing process, features from the traced map or image are captured as coordinates in point, line, or polygon format.

Types of Digitizing in GIS

There are several types of digitizing methods. Manual digitizing involves tracing geographic features from an external digitizing tablet using a puck (a type of mouse specialized for tracing and capturing geographic features from the tablet). Heads-up digitizing (also referred to as on-screen digitizing) is the method of tracing geographic features from another dataset (usually an aerial image, satellite image, or scanned image of a map) directly on the computer screen. Automated digitizing involves using image processing software that contains pattern recognition technology to generate vectors. More detail about creating geographic data can be found in this article: Methods for Creating Spatial Databases.

Types of Digitizing Errors in GIS

Since the most common methods of digitizing involve the interpretation of geographic features by the human hand, there are several types of errors that can occur during the course of capturing the data. The type of error that occurs when a feature is not captured properly is called a positional error, as opposed to an attribute error, where information about the captured feature is inaccurate or false. These positional error types are outlined below, and a visualization of the different error types is shown at the bottom of this section.

During the digitizing process, vectors are connected to other lines by a node, which marks the point of intersection. Vertices are the defining points along the shape of an unbroken line. Every line has a starting point known as a starting node and an end point known as an ending node. If the line is not straight, then any bends and curves on that line are defined by vertices (vertex for a single bend). Any intersection of two lines is denoted by a node at the point of intersection.
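A small sketch, assuming the shapely library (version 2.x) and an invented line, makes the distinction concrete: every coordinate of the line is a vertex, while the boundary of the line returns only its starting and ending nodes.

from shapely.geometry import LineString

# A digitized line with a bend: the interior coordinates are vertices,
# the two end coordinates are the starting and ending nodes.
line = LineString([(0, 0), (1, 2), (3, 2), (4, 0)])

vertices = list(line.coords)        # every coordinate defining the shape
nodes = line.boundary.geoms         # just the start and end points

print(vertices)                     # all four coordinate pairs
print([(p.x, p.y) for p in nodes])  # only (0.0, 0.0) and (4.0, 0.0)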

Dangles or Dangling Nodes

Dangles, or dangling nodes, are lines that are not connected but should be. With dangling nodes, gaps occur in the line work where two lines should be connected. Dangling nodes also occur when a digitized polygon doesn't connect back to itself, leaving a gap where the two end nodes should have met and creating what is called an open polygon.
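A rough way to flag candidate dangling nodes is to test whether each line endpoint touches any other line. The sketch below assumes the shapely library and invented coordinates; the helper function is hypothetical, and desktop GIS packages provide proper topology rules for the same job.

from shapely.geometry import LineString, Point

# Three digitized lines (illustrative coordinates only).
lines = [
    LineString([(0, 0), (5, 0)]),      # baseline
    LineString([(5, 0), (5, 5)]),      # joins the baseline at (5, 0)
    LineString([(2, 1), (2, 4)]),      # floats free of everything else
]

def dangling_endpoints(lines, tolerance=0.0):
    """Flag endpoints that touch no other line (candidate dangling nodes)."""
    flagged = []
    for i, line in enumerate(lines):
        coords = list(line.coords)
        for xy in (coords[0], coords[-1]):
            pt = Point(xy)
            others = (other for j, other in enumerate(lines) if j != i)
            if all(pt.distance(other) > tolerance for other in others):
                flagged.append(xy)
    return flagged

# The free ends of the first two lines are flagged along with both ends of
# the third; deciding which are genuine errors rather than legitimate dead
# ends requires local knowledge of the features being digitized.
print(dangling_endpoints(lines))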

Switchbacks, Knots, and Loops


These types of errors are introduced when the digitizer has an unsteady hand and moves the cursor or puck in such a way that the line being digitized ends up with extra vertices and/or nodes. In the case of switchbacks, extra vertices are introduced and the line ends up with a bend in it. With knots and loops, the line folds back onto itself, creating small polygon-like geometries known as weird polygons.

Overshoots and Undershoots


Similar to dangles, overshoots and undershoots happen when the digitized line doesn't connect properly with the neighboring line it should intersect. During digitization, a snap tolerance is set by the digitizer. The snap tolerance, or snap distance, is the measurement of the diameter extending from the point of the cursor. Any node of a neighboring line that falls within the circle of the snap tolerance will cause the end point of the line being digitized to snap automatically to that nearest node. Undershoots and overshoots occur when the snap distance is either not set or is set too low for the scale being digitized. Conversely, if the snap distance is set too high, the line endpoint may snap to the wrong node. In a few cases, undershoots and overshoots are not actually errors. One instance would be the presence of cul-de-sacs (i.e. dead ends) within a road GIS database.
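The sketch below, assuming the shapely library and invented coordinates, mimics this behaviour with shapely's snap function, which moves vertices of a geometry onto nearby vertices of a reference geometry when they fall within the tolerance; full GIS editing environments additionally snap to edges and endpoints.

from shapely.geometry import LineString
from shapely.ops import snap

# An existing road with a vertex at (5, 0) where a side street should join.
existing = LineString([(0, 0), (5, 0), (10, 0)])

# The digitized side street stops 0.3 units short of that junction: an undershoot.
digitized = LineString([(5, 4), (5, 0.3)])

# With a snap tolerance larger than the gap, the endpoint is pulled onto the
# junction vertex; with a smaller tolerance the undershoot is left in place.
fixed = snap(digitized, existing, tolerance=0.5)
unchanged = snap(digitized, existing, tolerance=0.1)

print(list(fixed.coords))      # [(5.0, 4.0), (5.0, 0.0)]  -> gap closed
print(list(unchanged.coords))  # [(5.0, 4.0), (5.0, 0.3)]  -> gap remains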

Slivers
Slivers occur where adjoining polygons in a digitized polygon layer fail to match up along their shared boundary, either leaving narrow gaps between them or overlapping in error; the thin area where two adjacent polygons overlap is called a sliver polygon. Again, setting the proper snap tolerance is critical for ensuring that the edges of adjoining polygons snap together and eliminate those gaps and overlaps.
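As a rough illustration, assuming the shapely library, invented parcel coordinates, and an arbitrary compactness threshold, the sketch below finds the thin overlap between two adjacent polygons and flags it as a likely sliver because its area is tiny relative to its perimeter.

import math
from shapely.geometry import Polygon

# Two adjacent parcels that should meet cleanly at x = 5, but the right-hand
# parcel was digitized slightly over the shared boundary.
left = Polygon([(0, 0), (5, 0), (5, 10), (0, 10)])
right = Polygon([(4.95, 0), (10, 0), (10, 10), (4.95, 10)])

overlap = left.intersection(right)

# Slivers are long and thin, so their compactness (4*pi*area / perimeter^2,
# which equals 1 for a circle) is very low. The 0.02 cut-off is illustrative.
compactness = 4 * math.pi * overlap.area / overlap.length ** 2
print(round(overlap.area, 3), round(compactness, 4))
if compactness < 0.02:
    print("likely sliver polygon between the parcels")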
