
Unit 3 GIS Data Sources and Structures 1

Uploaded by

YonesH gurUng
Copyright
© © All Rights Reserved

Unit-3

GIS Data Sources and Structures


Data: the fuel
Geographic data is information about the earth’s surface and the objects found on it. Data
is the fuel of a GIS. How can we feed data such as a map into a GIS? Data capture is the process
of putting information into the system. A wide variety of sources can be used for creating
geographic data; these are discussed below.

Types and Sources of Geographic Data


Geographic data are generally available in two forms: analogue data and digital data. Analogue
data is a physical product displaying information visually on paper, e.g. maps. Digital data is
information in computer-readable form, e.g. satellite data.
There are various sources from which we can get these different types of data. For example,
as shown in figure 3.2, the sources include maps, aerial photos, satellite images, existing
tabular data (in analogue and digital format), and field data (GPS). A GIS is able to capture these
different types of data from various sources. Creating a database, i.e. capturing the data, is the
initial stage of a GIS project and a time-consuming task.

Figure 3.1 Analogue and digital data


Figure 3.2 Data sources

Private Suppliers (Commercial Data)


There are many private sources of information. Commercial mapmaking firms are among the
largest providers, but other firms have for years supplied detailed demographic and economic
information, such as data on retail trade and marketing trends. Some of this information can
be quite expensive to purchase. Also, it is important to check on restrictions that might apply
to the use of commercially provided data. In some cases, copyright and licensing restrictions
may apply to your intended use and publication of the information. Many software vendors
earn a substantial income by repackaging and selling data in the proprietary forms used by
their software products. Because the data is usually checked and corrected as it is repackaged,
the use of these converted datasets can save time. The widespread expansion of this marketing
and re-marketing of data has been a boon to many users who do not wish to invest resources
in building the datasets they need on a day-to-day basis; they simply buy what they need.

It is important for us to consider the following questions about GIS data sources:
• What is the age of the data?
• Where did it come from?
• In what medium was it originally produced?
• What is the area coverage of the data?
• To what map scale was the data digitized?
• What projection, coordinate system, and datum were used in maps?
• What was the density of observations used for its compilation?
• How accurate are positional and attribute features?
• Does the data seem logical and consistent?
• Do cartographic representations look "clean?"
• Is the data relevant to the project at hand?
• In what format is the data kept?
• How was the data checked?
• Why was the data compiled?
• What is the reliability of the provider?

Data Capturing Methods


The different data capturing methods from various sources commonly used in a GIS are briefly
discussed below (see figure 3.3).

Photogrammetric Compilation

The primary source used in the process of photogrammetric compilation is aerial photography.
Generally, the process involves using specialized equipment (a stereoplotter) to project
overlapping aerial photos so that a viewer can see a three-dimensional picture of the terrain,
known as a photogrammetric model. The current technological trend in photogrammetry is
toward a greater use of digital procedures for map compilation.

Figure 3.3a Aerial Photography

Digitizing

A digitizing workstation with a digitizing tablet and cursor is typically used for trace digitizing.
Both the tablet and cursor are connected to a computer that controls their functions. Most
digitizing tablets come in standard sizes that relate to engineering drawing sizes ("A" through
"E," and larger). Digitizing involves tracing features on a source map, taped to the digitizing
tablet, with a precise cross hair in the digitizing cursor and instructing the computer to accept
the location and type of feature. The person performing the digitizing may separate features
into map layers, or attach an attribute to identify the feature.

Figure 3.3b Digitizer

Map Scanning

Optical scanning systems automatically capture map features, text, and symbols as individual
cells, or pixels, and produce an automated product in raster format as described earlier.
Scanning outputs files in raster form, usually in one of several compressed formats that save
storage space (e.g., TIFF 4, JPEG). Most scanning systems provide software to convert raster
data to a vector format, differentiating point, line, and area features. Scanning systems and
software are becoming more sophisticated, with some ability to interpret symbols and text and
store this information in databases. Creating an intelligent GIS database from a scanned map
requires vectorizing the raster data and manual time for entering attribute data from
scanned annotation.

Figure 3.3c Map Scanner

Satellite Data

Earth resources satellites have become a source of huge amounts of data for GIS applications.
The data obtained from the satellites are in digital form and can be imported directly into a
GIS. There are numerous satellite data sources, such as LANDSAT and SPOT. A new generation
of high-resolution satellite data that will increase opportunities and options for GIS database
development is becoming available from private sources and national governments. These
satellite systems will provide panchromatic (black and white) or multi-spectral data in the 1-
to 3-meter range, as compared to the 10- to 30-meter range available from traditional remote
sensing satellites.
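
To get a feel for what the move from 30-meter to 1-meter resolution means for data volume, the short sketch below counts the pixels needed to cover a study area. The function name and the 10 km by 10 km area are illustrative assumptions, not values from these notes.

```python
import math


def pixels_to_cover(width_m: float, height_m: float, resolution_m: float) -> int:
    """Number of square pixels of the given ground resolution needed to cover an area."""
    cols = math.ceil(width_m / resolution_m)   # pixels across
    rows = math.ceil(height_m / resolution_m)  # pixels down
    return cols * rows


# Hypothetical 10 km x 10 km study area
coarse = pixels_to_cover(10_000, 10_000, 30)  # traditional 30 m sensor
fine = pixels_to_cover(10_000, 10_000, 1)     # high-resolution 1 m sensor
print(coarse, fine, round(fine / coarse))
```

The same area needs roughly 900 times as many pixels at 1 m as at 30 m, which is why high-resolution imagery has a large impact on storage and processing.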

Figure 3.3d Satellite data

Field Data Collection

Advances in hardware and software have greatly increased opportunities for capture of GIS
data in the field (e.g., sign or utility inventories, property surveys, land use inventories). In
particular, electronic survey systems and the Global Positioning System (GPS) have
revolutionized surveying and field data collection. Electronic distance measurement devices
allow survey data to be gathered quickly in automated form for uploading to a GIS.
Sophisticated GPS collection units provide a quick means of capturing the coordinates
and attributes of features in the field.

Figure 3.3e GPS

Tabular Data Entry

Some of the tabular attribute data that is normally in a GIS database exists on maps as
annotation or can be found in paper files. Information from these sources will be required
for GIS applications and will have to be converted to digital form through keyboard entry. This
kind of data entry is commonplace and relatively easy to accomplish.

Document Scanning

Smaller-format scanners can also be used to create raster files of documents such as permit
forms, service cards, site photographs, etc. These documents can be indexed in a relational
database by number, type, date, engineering drawings, etc., and queried and displayed by users.
GIS applications can be built that allow users to point to and retrieve for display a scanned
document (e.g., tax parcel) interactively.

Translation of Existing Digital Data

Existing automated data may be available from existing tabular files maintained by outside
sources. Many programs are available that translate such data and, in fact, many GIS
packages can be acquired with programs that translate data to and from several "standard"
formats that are widely accepted by the mapping industry and have been used as intermediate
"exchange" formats for moving data between platforms (e.g., Intergraph SIF, TIGER,
Shapefile and AutoCAD DXF).

Spatial Referencing
In the early days of GIS, users were handling spatially referenced data from a single country.
The data was derived from paper maps published by the country’s mapping organization.
Nowadays, GIS users are combining spatial data from a certain country with global spatial data
sets, reconciling spatial data from a published map with coordinates established with satellite
positioning techniques and integrating spatial data from neighboring countries. To perform
these tasks successfully, GIS users need a certain level of appreciation for a few basic spatial
referencing concepts pertinent to published maps and spatial data.
Geographic referencing, which is sometimes simply called georeferencing, is defined as the
representation of the location of real-world features within the spatial framework of a particular
coordinate system. The objective of georeferencing is to provide a rigid spatial framework by
which the positions of real-world features are measured, computed, recorded, and analyzed.
In practice, georeferencing can be seen as a series of concepts and techniques that progressively
transform measurements carried out on the irregular surface of the earth to the flat surface of a
map, and make them readily measurable on this flat surface by means of a coordinate system. The concept of
representing the physical shape of earth by means of a mathematical surface and the realization
of this concept by the definitions of the geoid and ellipsoid are fundamental to georeferencing.
Spatial reference system and frames

The geometry and motion of objects in 3D Euclidean space are described in a reference
coordinate system. A reference coordinate system is a coordinate system with a well-defined
origin and orientation of the three orthogonal coordinate axes. We shall refer to such a system
as a spatial reference system (SRS). A spatial reference system is a mathematical abstraction.
It is realized by means of a spatial reference frame (SRF). We may visualize an SRF as a
catalogue of coordinates of specific, identifiable point objects, which implicitly materialize the
coordinate axes of the SRS.
Several spatial reference systems are used in the earth sciences. The most important one for
the GIS community is the International Terrestrial Reference System (ITRS). The ITRS has
its origin in the center of mass of the earth. The Z-axis points towards a mean earth north pole.
The X-axis is oriented towards a mean Greenwich meridian and is orthogonal to the Z-axis. The
Y-axis completes the right-handed reference coordinate system.

(a) The ITRS and (b) The ITRF visualized as the fundamental polyhedron

Introduction
Today it is common to determine a point’s position using Global Navigation Satellite Systems
(GNSS). If GPS is the GNSS used, the point’s position is determined in the reference
system WGS 84. When observing in a good GNSS environment, the absolute accuracy of a single
point position fix is ± 5 to 10 metres in the horizontal (i.e. in 2 dimensions) at the 2 sigma
(2σ) confidence level. It is possible to increase the accuracy of point positioning, but this
usually requires either a positioning service such as Fugro Omnistar or post-processing using
precise orbits. If higher accuracy is required (mm to cm), GNSS data from
points of known position in the region are needed. The resulting co-ordinates for the point
will then be in the same reference frame as the local point. This local point could be a
permanent GNSS station in a continuously operating reference station (CORS) network that is
linked to an International Terrestrial Reference Frame (ITRF).
In a GNSS CORS network the surveyor will normally derive a height based on the reference
ellipsoid, i.e. that of the Geodetic Reference System 1980 (GRS80). Most users, however, work with
'physical' heights based on a local height datum (i.e. local mean sea level) and thus need to relate
the derived ellipsoid height to this local height datum. This is achieved by using a geoid model
for the survey area.
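
The relation between ellipsoid height and physical height is commonly written H = h − N, where h is the ellipsoid height from GNSS, N is the geoid undulation taken from the geoid model, and H is the height above the local datum. The sketch below applies this relation; the function name and the sample values are hypothetical and only serve as an illustration.

```python
def orthometric_height(ellipsoid_height_m: float, geoid_undulation_m: float) -> float:
    """Return the physical height H = h - N (all values in metres)."""
    return ellipsoid_height_m - geoid_undulation_m


# Hypothetical values: h from a GNSS observation, N read from a geoid model
h = 1350.0   # ellipsoid height (m)
N = -30.5    # geoid undulation (m); negative where the geoid lies below the ellipsoid
H = orthometric_height(h, N)
print(f"Height above local datum: {H:.1f} m")
```

In practice N varies from place to place, so it has to be looked up or interpolated from the geoid model for each surveyed point.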
From a spatial information perspective, it is common for spatial datasets and geographical
information to extend over national or regional boundaries. In this situation a common
reference frame is needed for the collection, storage, visualisation and exchange of
the information. ITRF is the most accurate reference frame that exists internationally and
consequently more countries are adopting a national solution based on ITRF.

What is ITRS and ITRF?
Co-ordinates in the International Terrestrial Reference System (ITRS) are computed at different
epochs and the solutions are called ITRF. Due to plate tectonics and tidal deformation, the
co-ordinates of a given point change between the different ITRF solutions. At the time of writing,
the latest version of ITRF was ITRF 2005. In simple terms, the ITRF is a realisation of the ITRS.
What is the difference between ITRF based datums and WGS84 coordinates?
WGS84, the World Geodetic System 1984, is the geodetic reference system used by
GPS. WGS84 was developed for the United States Defence Mapping Agency
(DMA), now called NGA (National Geospatial-Intelligence Agency). Although the name
WGS84 has remained the same, it has been enhanced on several occasions to a point where it
is now very closely aligned to ITRF and referenced as WGS 84 (G1150). The origin of the
WGS84 framework is also the earth’s centre of mass. For all practical purposes, an ITRF based
geodetic datum or CORS network and WGS84 are the same. The difference is of the order of
centimetres.
What are ITRF Co-ordinates?
ITRF co-ordinates or positions are expressed as three-dimensional geocentric or Earth-Centred
Cartesian co-ordinates, i.e. X, Y and Z. To convert these Cartesian co-ordinates to geographic
co-ordinates (latitude, longitude and ellipsoid height) the GRS80 ellipsoid is normally
used, as it is the best-fitting scientific and mathematical global figure or model for the earth’s
surface.
Note: in some cases it is necessary to describe an ITRF position in plane (grid) co-ordinates
(e.g. two dimensions: eastings and northings), hence a mathematical map projection is used. A
popular map projection that preserves angles is the Transverse Mercator projection.
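
The link between geographic and Cartesian co-ordinates can be illustrated with the closed-form conversion from latitude, longitude and ellipsoid height to X, Y and Z on the GRS80 ellipsoid. This is the reverse of the direction described above, shown here because it needs no iteration. The function name is an assumption made for this sketch; the ellipsoid constants are the published GRS80 values.

```python
import math

GRS80_A = 6378137.0            # semi-major axis (m)
GRS80_F = 1.0 / 298.257222101  # flattening


def geodetic_to_ecef(lat_deg: float, lon_deg: float, h_m: float):
    """Convert latitude/longitude (degrees) and ellipsoid height (m) to X, Y, Z (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = GRS80_F * (2.0 - GRS80_F)  # first eccentricity squared
    # Radius of curvature in the prime vertical at this latitude
    n = GRS80_A / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + h_m) * math.cos(lat) * math.cos(lon)
    y = (n + h_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h_m) * math.sin(lat)
    return x, y, z


# A point on the equator at the Greenwich meridian lies on the X-axis
print(geodetic_to_ecef(0.0, 0.0, 0.0))
```

Converting in the other direction (X, Y, Z to latitude, longitude and height) is usually done iteratively or with a dedicated library rather than by hand.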
What are the benefits of a geodetic datum based on ITRF?
Adopting an ITRF based geodetic datum allows for a single standard for collecting, storing
and using geographic or survey related data. This will ensure compatibility across various
geographic, land and survey systems at the local, regional, national and global level. This is
the main reason that ITRF based CORS networks should form the basis for Spatial Data
Infrastructure (SDI), which is the enabling infrastructure to manage a country’s key spatial data
sets, i.e. it underpins or is the reference layer for the cadastre, transit / road networks,
infrastructure corridors like gas, water, power, communications etc. An ITRF based geocentric
datum or CORS network will also:
• provide direct compatibility with GNSS measurements and mapping or geographic
information systems (GIS), which are also normally based on an ITRF based geodetic datum;
• minimise the need for casual users to understand datum transformations;
• allow more efficient use of an organisation’s spatial data resources by reducing the need for
duplication and unnecessary translations;
• help promote wider use of spatial data through one user friendly data environment;
• reduce the risk of confusion as GNSS, GIS and navigation systems become more widely used
and integrated into business and recreational activities.

What is GPS:
Where do I stand?

Knowing where you are and where you are going has been the most crucial and challenging task
faced by explorers since ancient times. Positioning and navigation are very important in
many activities, and many tools and techniques have been adopted for this purpose. People
have used the magnetic compass, sextant and theodolite, and measured the positions of the sun,
moon and stars, to find out their own position. Today, the Global Positioning System (GPS) has been
developed by the US Department of Defence (DoD) for worldwide positioning, at a cost of
12 billion dollars.
GPS is a worldwide radio-navigation system formed from a constellation of 24 satellites and
their ground stations. It uses these "man-made stars" as reference points to calculate positions
accurate to a matter of meters. GPS receivers have become very economical, making the
technology accessible to virtually everyone. GPS provides continuous three-dimensional
positioning 24 hours a day to the military and civilian users throughout the world. These days
GPS is finding its way into cars, boats, planes, construction equipment, farm machinery, even
laptop computers. It has a tremendous number of applications in GIS data collection,
surveying, and mapping. GPS is increasingly used for precise positioning of geospatial data
and the collection of data in the field.

Components of the GPS


The Global Positioning System is divided into three major components: the control segment,
the space segment, and the user segment. All three of these segments are required to perform
positional determination.

CONTROL SEGMENT
The Control Segment consists of five monitoring stations - Colorado Springs, Ascension
Island, Diego Garcia, Hawaii, and Kwajalein Island (figure 3.4). Colorado Springs serves as
the master control station. The Control Segment is the sole responsibility of the DoD, which
undertakes construction, launching, maintenance, and virtually constant performance
monitoring of all GPS satellites. The monitoring stations track all GPS signals for use in
controlling the satellites and predicting their orbits.

Figure 3.4: Control segment

SPACE SEGMENT
The Space Segment consists of the constellation of earth orbiting satellites. The satellites are
arrayed in 6 orbital planes, inclined 55 degrees to the equator (figure 3.5). They orbit at
altitudes of about 12,000 miles each. Each satellite contains four precise atomic clocks
(Rubidium and Cesium standards) and has a microprocessor on board for limited self-
monitoring and data processing. The satellites are equipped with thrusters, which can be used
to maintain or modify their orbits.

Figure 3.5 Space segment

USER SEGMENT
The User Segment consists of all earth-based GPS receivers (figure 3.6). Receivers vary
greatly in size and complexity, though the basic design is rather simple. The typical receiver
is composed of an antenna and preamplifier, radio signal microprocessor, control and display
device, data recording unit, and power supply (figure 3.6). The GPS receiver decodes the
timing signals from the ‘visible’ satellites (four or more) and, having calculated their distances,
computes its own latitude, longitude, elevation, and time. This is a continuous process and
generally the position is updated on a second-by-second basis, output to the receiver display
device and, if the receiver provides data capture capabilities, stored by the receiver logging
unit.

Figure 3.6 GPS receiver

HOW GPS WORKS
The GPS uses satellites and computers to compute positions anywhere on earth. The GPS is
based on satellite ranging. That means the position on the earth is determined by measuring
the distance from a group of satellites in space. Triangulation from the satellites is the basis of
the system (strictly speaking this is trilateration, since distances rather than angles are
measured). To triangulate, the GPS measures the distance using the travel time of a radio
message, for which it needs a very accurate clock. Besides the distance to a satellite,
we also need to know where the satellite is in space.
To compute a position in three dimensions, we need to have four satellite measurements. The
GPS uses a trigonometric approach to calculate the positions (figure 3.7). The GPS satellites
are so high up that their orbits are very predictable and each of the satellites is equipped with
a very accurate atomic clock.
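
The ranging idea can be sketched in a few lines: each measured travel time is turned into a distance with the speed of light, and four such distances are enough to solve for the three position co-ordinates plus the receiver clock error. The sketch below is only an illustration under simplifying assumptions (made-up satellite positions, no atmospheric or orbit errors, a plain Newton iteration starting from the earth's centre); the function names are hypothetical and this is not how any particular receiver is implemented.

```python
import math

C = 299_792_458.0  # speed of light (m/s)


def solve_linear(a, b):
    """Solve a small linear system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def solve_position(sats, ranges, iterations=10):
    """Estimate receiver X, Y, Z (m) and clock error (s) from four measured ranges."""
    x = y = z = b = 0.0  # start at the earth's centre; b is the clock error in metres
    for _ in range(iterations):
        jac, res = [], []
        for (sx, sy, sz), rho in zip(sats, ranges):
            dx, dy, dz = x - sx, y - sy, z - sz
            d = math.sqrt(dx * dx + dy * dy + dz * dz)
            jac.append([dx / d, dy / d, dz / d, 1.0])
            res.append(rho - (d + b))  # measured minus predicted range
        step = solve_linear(jac, res)
        x, y, z, b = x + step[0], y + step[1], z + step[2], b + step[3]
    return (x, y, z), b / C


# Hypothetical satellite positions (m) and a receiver whose clock is 0.1 ms off
sats = [(26_560e3, 0.0, 0.0), (15_000e3, 21_900e3, 0.0),
        (15_000e3, -10_950e3, 18_960e3), (15_000e3, -10_950e3, -18_960e3)]
truth, clock_error = (6_371_000.0, 0.0, 0.0), 1e-4
travel_times = [math.dist(truth, s) / C + clock_error for s in sats]
ranges = [C * t for t in travel_times]  # distance = speed of light x travel time
position, dt = solve_position(sats, ranges)
print(position, dt)
```

The fourth measurement is what lets the receiver, which has only an inexpensive clock, solve for its own clock error alongside the three co-ordinates.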

Figure 3.7 GPS triangulation

GPS errors

Although the GPS looks like a perfect system, there are a number of sources of errors which
are difficult to eliminate (figure 3.8). The ultimate accuracy of GPS is determined by the sum of
these several sources of error.

Figure 3.8 GPS Errors

SATELLITE ERRORS
Slight inaccuracies in time keeping by the satellites can cause errors in calculating our
positions. Similarly, the satellite’s position in space is equally important as it is the starting
point of the calculations. Although the GPS satellites are at very high orbits and are relatively
free from the perturbing effects of atmosphere, they still drift slightly from their predicted
orbits which contributes to our errors.

THE ATMOSPHERE
The GPS signals have to travel through charged particles and water vapour in the atmosphere,
which delay their transmission. Since the atmosphere varies from place to place and over
time, it is not possible to compensate accurately for the delays that occur.

Multipath error
As the GPS signal finally arrives at the earth’s surface, it may be reflected by local obstructions
before it gets to the receiver’s antenna. This is called multipath error as the signal is reaching
the antenna by multiple paths.

RECEIVER ERROR
Since the receivers are also not perfect, they can introduce their own errors which usually occur
from their clocks or internal noise.

SELECTIVE AVAILABILITY
Selective availability (SA) was the intentional error introduced by DoD to make sure that no
hostile forces used the accuracy of GPS against the US or its allies. It introduced some noise
into the GPS satellite clocks which reduced their accuracy. The satellites were also given some
erroneous orbital data which was transmitted as a part of each satellite’s status message. These
two factors significantly reduced the accuracy of GPS in civilian uses.
On May 1st, 2000, the White House announced a decision to discontinue the intentional
degradation of the GPS signals to the public. Civilian users of GPS will be able to pinpoint
locations up to ten times more accurately. The decision to discontinue SA is the latest measure
in an on-going effort to make GPS more responsive to civil and commercial users worldwide.

Differential positioning
To eliminate most of the errors discussed above, the technique of differential positioning is
applied. Differential GPS carries the triangulation principle one step further, with a second
receiver at a known reference point. The reference station is placed on a control point, i.e. a
triangulated position whose coordinates are known. This allows a correction factor to be
calculated and applied to other roving GPS units used in the same area and at the same
time. This error correction allows a considerable amount of error to be removed, potentially
as much as 90 per cent. The error correction can be applied either in post-processing or in real
time (figure 3.9).
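
The correction step can be sketched as follows. This is a simplified position-domain version with hypothetical function names and made-up grid co-ordinates; operational differential GPS usually corrects the individual satellite ranges rather than the final position.

```python
def differential_correction(base_known, base_measured):
    """Correction = known position of the reference station minus its GPS fix."""
    return tuple(k - m for k, m in zip(base_known, base_measured))


def apply_correction(rover_measured, correction):
    """Apply the reference-station correction to a rover fix taken at the same time."""
    return tuple(r + c for r, c in zip(rover_measured, correction))


# Hypothetical eastings/northings in metres
base_known = (500000.0, 3000000.0)     # surveyed control point
base_measured = (500003.0, 2999996.0)  # GPS fix at the control point
rover_measured = (500123.0, 3000146.0)

correction = differential_correction(base_known, base_measured)
rover_corrected = apply_correction(rover_measured, correction)
print(correction, rover_corrected)
```

The approach relies on the rover and the reference station seeing nearly the same errors, which is why both must be in the same area and observe at the same time.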

Figure 3.9 Differential positioning

Integration of GPS and GIS


It is possible to integrate GPS positioning in GIS for field data collection. GPS is also used
in remote-sensing methods such as photogrammetry, aerial scanning, and video technology.
GPS is becoming a very effective tool for GIS data capture. The GIS user community benefits
from the use of GPS for locational data capture in various GIS applications. A GPS receiver can
easily be linked to a laptop computer in the field and, with appropriate software, users can
also have all their data on a common base with very little distortion. Thus GPS can help in
several aspects of the construction of accurate and timely GIS databases.

Some Applications of GPS:

• Fishing
• Hiking
• Sailing/Boating
• Automobile
• Cell Phones
• Pilots
• Biking
• Education

Remote Sensing
Remote sensing satellite images give a synoptic (bird’s eye) view of any place on the Earth’s
surface, which helps to study, map, and monitor the Earth’s surface at local and/or
regional/global scales. It is cost effective and gives better spatial coverage as compared to
ground sampling.
Generally, Remote Sensing refers to the activities of recording/observing/perceiving (sensing)
objects or events at far away (remote) places.
Remote Sensing is defined as the science and technology by which the characteristics of
objects of interest can be identified, measured or analyzed without direct
contact. Remote Sensing deals with gathering information about the Earth from a distance.
This can be done from a few metres off the Earth’s surface, from an aircraft flying hundreds or
thousands of metres above the surface, or from a satellite orbiting hundreds of kilometers above
the Earth.

Figure 3.10 Earth from Space


Remote-sensing satellite

The remote sensing satellites are equipped with sensors looking down to the earth. They are
the "eyes in the sky" constantly observing the earth as they move around the earth (figure 3.11).

Figure 3.11 Remote Sensing satellite

How does remote sensing work?

Electro-magnetic radiation which is reflected or emitted from an object is the usual source of
remote sensing data. A device to detect the electro-magnetic radiation reflected or emitted from
an object is called a "remote sensor" or "sensor". Cameras or scanners are examples of remote
sensors. A vehicle to carry the sensor is called a "platform". Aircraft or satellites are
used as platforms.
The characteristics of an object can be determined using the electro-magnetic radiation
reflected or emitted from the object. That is, each object has unique characteristics of
reflection or emission, which differ with the type of object and the environmental condition.
Remote sensing is a technology to identify and understand the object or the environmental
condition through the uniqueness of the reflection or emission. This concept is illustrated in
figure 3.12.

Figure 3.12 Remote Sensing


Types of remote-sensing images
Presently there are several remote sensing satellite series in operation. Different satellite
systems have different characteristics, e.g. resolution and number of bands, and each has its own
importance for different applications. Some major satellite systems and their major
characteristics are given below:

Satellite System    Spatial Resolution    Type              Number of Bands    Launched by
LANDSAT-TM          30 m                  Multi-spectral    7                  USA
LANDSAT-MSS         80 m                  Multi-spectral    4                  USA
SPOT-XS             20 m                  Multi-spectral    3                  France
SPOT-PAN            10 m                  Panchromatic      1                  France
IRS-1C PAN          6 m                   Panchromatic      1                  India
LISS-III            24 m                  Multi-spectral    4                  India
WiFS                188 m                 Multi-spectral    2                  India
SPIN-2              2 m                   Panchromatic      1                  USA/Russia
IKONOS              1 m                   Panchromatic      1                  USA
IKONOS              4 m                   Multi-spectral    4                  USA
ADEOS-AVNIR M       16 m                  Multi-spectral    4                  Japan
NOAA                1.1 km                Multi-spectral    5                  USA
MOS                 50 m                  Multi-spectral    4                  Japan

Remote-sensing images
Remote sensing images are normally in the form of digital images (figure 3.14). In order to
extract useful information from the images, image processing techniques are applied to
enhance the image to help visual interpretation, and to correct or restore the image if the image
has been subjected to geometric distortion, blurring or degradation by other factors. There are
many image analysis techniques available, and the methods used depend upon the
requirements of the specific problem concerned.
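
As one example of an enhancement technique, a linear contrast stretch rescales the pixel values of a dull image so that they use the full display range. The sketch below works on a small image held as a list of rows; the function name and the sample pixel values are hypothetical, and real projects would normally use an image processing package.

```python
def linear_stretch(image, out_min=0, out_max=255):
    """Rescale pixel values linearly so the darkest maps to out_min and the brightest to out_max."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A completely uniform image cannot be stretched
        return [[out_min for _ in row] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (v - lo) * scale) for v in row] for row in image]


# Hypothetical low-contrast 2 x 2 image with values between 50 and 100
image = [[50, 60],
         [70, 100]]
print(linear_stretch(image))
```

The stretch changes only how the values are displayed; the relative order of brightness between pixels is preserved.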

Figure 3.14 Satellite Image of Kathmandu

Use of Remote Sensing Data in GIS


Remote sensing data can be integrated with various other geographic data. There has been
an increasing trend towards integrating remote sensing data into GIS for analytical purposes.
There are many ways we could use remote sensing data, and some examples are given below:
Land cover maps or vegetation maps classified from remote sensing data can be overlaid onto
other geographic data, which enables analysis for environmental monitoring and its
change.
Image data are sometimes also used as image maps, with an overlay of political boundaries,
roads, rivers etc. Such an image map can be successfully used for visual interpretation (figure
3.15 and 3.16).

Figure 3.15 Kathmandu urban area observed from an ADEOS-AVNIR M Japanese satellite
image, 1997, and overlaid with road and river features

Figure 3.16 3-D perspective of the Kathmandu valley generated by draping a LANDSAT-TM,
1988, satellite image over a DEM

Importance:

• Large amounts of data are needed, and Remote Sensing can provide them
• Reduces manual field work dramatically
• Allows retrieval of data for regions difficult or impossible to reach:
    • Open ocean
    • Hazardous terrain (high mountains, extreme weather areas, etc.)
    • Ocean depths
    • Atmosphere
• Allows for the collection of much more data in a shorter amount of time
    • Leads to increased land coverage AND
    • Increased ground resolution of a GIS
• Digital Imagery greatly enhances a GIS
    • DIRECTLY: Imagery can serve as a visual aid
    • INDIRECTLY: Can serve as a source to derive information such as:
        • Land use/land cover
        • Atmospheric emissions
        • Vegetation
        • Water bodies
        • Cloud cover
        • Change detection (including sea ice, coastlines, sea levels, etc.)

Extracting RS Data

• Layers such as roads (yellow) and rivers (blue) can be easily seen from air/satellite photos
• This information is digitized, separated into layers, and integrated into a GIS

Data Digitizing Process:

• MANUAL
    • Map is fixed to digitizer table
    • Control Points are digitized
    • Feature Boundaries are digitized in stream or point mode
    • The layer is proofed and edited
    • The layer is transformed/registered to a known system
• AUTOMATED SCANNERS
    • Digitizing done automatically by a scanner
    • There is a range of scanner qualities
    • Most utilize the reflection/transmission of light to record data
    • “Thresholding” allows for the determination of both line and point features from a hardcopy map
    • Editing still required
• DIRECT DATA ENTRY
    • Coordinate Geometry is used, with GPS playing a vital role
    • This involves directly entering coordinates measured in the field
    • These coordinates can then be tagged with attribute data
    • This data is then downloaded to a computer and incorporated into a GIS
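
The thresholding step listed under automated scanners can be sketched as a simple rule: every scanned pixel darker than a chosen cutoff is treated as ink (part of a line or point feature) and everything else as background. The function name, the cutoff of 128 and the sample values below are assumptions for illustration only.

```python
def threshold(image, cutoff=128):
    """Mark pixels darker than the cutoff as feature (1) and the rest as background (0)."""
    return [[1 if value < cutoff else 0 for value in row] for row in image]


# Hypothetical 3 x 4 grey-scale scan (0 = black ink, 255 = white paper)
scan = [[250, 240, 30, 245],
        [248, 35, 28, 251],
        [40, 242, 247, 249]]
for row in threshold(scan):
    print(row)
```

The resulting binary image still needs vectorizing and editing, since stains, folds and text on the source map can fall on the wrong side of the cutoff.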

Data Preparation
Spatial data preparation aims to make the acquired spatial data fit for use. Images may require
enhancements and corrections of the classification scheme of the data. Vector data also may
require editing, such as the trimming of overshoots of lines at intersections, deleting duplicate
lines, closing gaps in lines, and generating polygons. Data may need to be converted to either
vector format or raster format to match other data sets. Additionally, the process includes
associating attribute data with the spatial data through either manual input or reading digital
attribute files into the GIS/DBMS.
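
One of the editing tasks mentioned above, deleting duplicate lines, can be sketched as follows. Each line is represented here as a pair of end points, and two lines count as duplicates when they join the same two points regardless of digitizing direction. The function name and the sample co-ordinates are hypothetical.

```python
def remove_duplicate_lines(lines):
    """Keep the first copy of each line segment, ignoring digitizing direction."""
    seen = set()
    unique = []
    for start, end in lines:
        key = tuple(sorted((start, end)))  # same key for A->B and B->A
        if key not in seen:
            seen.add(key)
            unique.append((start, end))
    return unique


# Hypothetical digitized segments; the third repeats the first in reverse
lines = [((0, 0), (10, 0)),
         ((10, 0), (10, 5)),
         ((10, 0), (0, 0))]
print(remove_duplicate_lines(lines))
```

This exact-match test only catches lines whose end points are identical; lines digitized twice with slightly different co-ordinates need a tolerance-based check instead.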

The intended use of the acquired spatial data, furthermore, may require thinning the data set
and retaining only the features needed. The reason may be that not all features are relevant
for subsequent analysis or subsequent map production. In this case, data and/or cartographic
generalization must be performed to restrict the original data set.

Data precision, error and repair


Acquired data sets must be checked for consistency and completeness. This requirement
applies to the geometric and topological quality as well as the semantic quality of the data.
There are different approaches to cleaning up data. Errors can be identified automatically, after
which manual editing methods can be applied to correct the errors. Alternatively, a system may
identify and automatically correct many errors. Clean-up operations are often performed in a
standard sequence. For example, crossing lines are split before dangling lines are erased, and
nodes are created at intersections before polygons are generated.

Precision refers to the level of measurement and exactness of description in a GIS database.
Precise location data may measure position to a fraction of a unit. Precise attribute information
may specify the characteristics of features in great detail. It is important to realize, however,
that precise data, no matter how carefully measured, may be inaccurate. Surveyors may make
mistakes, or data may be entered into the database incorrectly.

• The level of precision required for particular applications varies greatly. Engineering
projects such as road and utility construction require very precise information measured
to the millimeter or tenth of an inch.

• Highly precise data can be very difficult and costly to collect. Carefully surveyed
locations, such as those utility companies need to record the positions of pumps, wires,
pipes and transformers, cost $5-20 per point to collect.

Error encompasses both the imprecision of data and its inaccuracies.
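
The difference between precision and accuracy can be shown with a small worked example. The numbers below are invented for illustration only: a control point with an assumed true easting is measured five times, and the readings agree closely with each other (high precision) while all being offset from the true value (low accuracy).

```python
import statistics

# Hypothetical control point whose true easting is assumed known, in metres.
true_easting = 500.00

# Hypothetical repeated measurements: tightly grouped but offset from the truth.
measured = [502.01, 502.03, 501.99, 502.02, 502.00]

# Precision: how closely repeated measurements agree with each other.
spread = statistics.stdev(measured)

# Accuracy: how far the measurements lie, on average, from the true value.
bias = statistics.mean(measured) - true_easting

print(f"spread = {spread:.3f} m, bias = {bias:.2f} m")
```

A small spread together with a large bias is exactly the case described above: precise data that is nevertheless inaccurate.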

Multiple data sources

A GIS project usually involves multiple data sets, so the next step is to address how these
data sets relate to each other. There are three fundamental cases to consider when data sets
are compared pairwise:

• They may be about the same area, but differ in accuracy,

• They may be about the same area, but differ in choice of representation, and

• They may be about adjacent areas, and have to be merged into a single data set.

Differences in accuracy

Images come at a certain resolution, and paper maps at a certain scale. This typically results
in differences of resolution between acquired data sets, all the more so since map features are
sometimes intentionally displaced to improve the map. For instance, the course of a river will
only be approximated roughly on a small-scale map, and a village on its northern bank should be
depicted north of the river, even if this means it has to be displaced on the map a little bit.
The small scale thus introduces a positional error. If we want to combine a digitized version
of that map with a digitized version of a large-scale map, we must be aware that features may
not be where they seem to be. Analogous examples can be given for images at different
resolutions.

In the figure above, the polygons of two digitized maps at different scales are overlaid. Due to
scale differences in the sources, the resulting polygons do not perfectly coincide, and polygon
boundaries cross each other. This produces small artifact polygons in the overlay, known as
sliver polygons.
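
One common way to find sliver polygons after an overlay is to flag polygons that are both very small and very elongated. The sketch below assumes this heuristic; it is not a method taken from these notes, and the thresholds and function names are hypothetical and would have to be tuned to the scale of the data. Elongation is measured with the compactness ratio 4πA/P², which is 1 for a circle and approaches 0 for a thin strip.

```python
import math


def area(ring):
    """Shoelace formula for a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def perimeter(ring):
    """Sum of edge lengths, including the closing edge."""
    return sum(math.dist(p, q) for p, q in zip(ring, ring[1:] + ring[:1]))


def is_sliver(ring, max_area, max_compactness):
    """Flag polygons that are both small and elongated."""
    a = area(ring)
    p = perimeter(ring)
    compactness = 4 * math.pi * a / (p * p) if p else 0.0
    return a <= max_area and compactness <= max_compactness


# Hypothetical overlay output: one genuine parcel and one thin artifact strip.
parcel = [(0, 0), (100, 0), (100, 100), (0, 100)]
strip = [(0, 0), (100, 0), (100, 0.5), (0, 0.2)]

for name, ring in (("parcel", parcel), ("strip", strip)):
    print(name, is_sliver(ring, max_area=50.0, max_compactness=0.1))
```

Polygons flagged in this way are usually merged into a neighbouring polygon or removed, after a visual check.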

Differences in representation

More advanced GIS applications require the possibility of representing the same geographic
phenomenon in different ways. Map production at various map scales is again an example, but
there are numerous others. The commonality is that phenomena must sometimes be viewed as
points, and at other times as polygons, for instance. The complexity that this requirement
entails is that the GIS or the DBMS must keep track of links between different representations
of the same phenomenon, and must also provide support for decisions as to which representation
to use in which situation.

For example, a small-scale national road network analysis may represent villages as point
objects, but a nationwide urban population density study should regard all municipalities as
represented by polygons. The links that the system maintains between the various
representations of the same things allow interactive traversal, and many applications of
these links seem possible. Systems that support this type of data traversal are called
multi-representation systems.

Data Transformation

In virtually all mapping applications it becomes necessary to convert from one cartographic
data structure to another. The ability to perform these object-to-object transformations often is
the single most critical determinant of a mapping system's flexibility.

Format change: raster-to-vector and vector-to-raster conversion within the same GIS.
May also include raster to vector and vector to raster data.

Issues to consider:

• Loss of detail: especially at feature edges; vector data generally represents a feature
more accurately.

• Loss of attribute data: some raster formats do not allow for multiple attributes per cell.

Vector and raster formats store similar GIS data in very different ways.

A particular GIS will adopt one of two strategies for dealing with the two types of data. Some
systems use one format exclusively and provide utilities or import options to bring in the
other type of data and convert it to the needed format.

Other GIS software supports the native format of each type of data and requires the GIS
operator to change the formats explicitly when an operation requires a common format.
In both cases the computer program performs the raster-to-vector and vector-to-raster
conversion. Most often, when converting from vector to raster, the results are visually
satisfactory, but the conversion techniques can produce results that do not faithfully reflect
the attributes each grid cell represents. This is particularly true along the edges of areas,
where the user seldom knows the decision rules concerning how partial cells are handled.
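
The effect of the decision rule for partial cells can be made concrete with a small sketch. It rasterizes one triangular polygon three times, coding a cell as occupied when any part, at least half, or all of the cell is covered. This is an illustration under stated assumptions, not the algorithm of any particular GIS: coverage is estimated from 4 x 4 sample points per cell, row 0 is the bottom row of the grid, and the function names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def coverage(ring, row, col, cell, n=4):
    """Estimate the covered fraction of one cell from n * n sample points."""
    hits = 0
    for i in range(n):
        for j in range(n):
            x = (col + (i + 0.5) / n) * cell
            y = (row + (j + 0.5) / n) * cell
            hits += point_in_polygon(x, y, ring)
    return hits / (n * n)


def rasterize(ring, rows, cols, cell, min_fraction):
    """Code a cell 1 when its estimated covered fraction reaches min_fraction."""
    return [
        [1 if coverage(ring, r, c, cell) >= min_fraction else 0 for c in range(cols)]
        for r in range(rows)
    ]


triangle = [(0, 0), (4, 0), (0, 3)]  # true area: 6 square units

any_touch = rasterize(triangle, 4, 4, 1.0, 1 / 16)  # any sample point covered
majority = rasterize(triangle, 4, 4, 1.0, 0.5)      # at least half covered
full = rasterize(triangle, 4, 4, 1.0, 1.0)          # every sample point covered

for name, grid in (("any", any_touch), ("majority", majority), ("full", full)):
    print(name, sum(map(sum, grid)), "cells")
```

The three grids differ only in the cells along the sloping edge, which is where a user who does not know the rule in force cannot tell how much of the feature a cell really contains.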

Conversely, when converting from raster to vector, you may preserve the vast majority of
the attribute data, but the visual results will often reflect the blocky, step-like form of
the grid cells. The size of the grid cells from which conversion proceeds is an important
factor controlling the "blockiness" of the resulting vector. Different mathematical smoothing
algorithms can minimize this effect.
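
As one example of such a smoothing algorithm, the sketch below applies Chaikin's corner-cutting scheme to a staircase-shaped line of the kind produced by raster-to-vector conversion. The choice of Chaikin's method is an assumption made for illustration; the notes do not name a specific algorithm, and the function name is hypothetical.

```python
def chaikin(points, iterations=2):
    """Chaikin corner cutting on an open polyline; the two endpoints are kept."""
    for _ in range(iterations):
        smoothed = [points[0]]
        for (x1, y1), (x2, y2) in zip(points, points[1:]):
            # Replace each segment by the points at 1/4 and 3/4 of its length,
            # which cuts off the corner at every interior vertex.
            smoothed.append((0.75 * x1 + 0.25 * x2, 0.75 * y1 + 0.25 * y2))
            smoothed.append((0.25 * x1 + 0.75 * x2, 0.25 * y1 + 0.75 * y2))
        smoothed.append(points[-1])
        points = smoothed
    return points


# Hypothetical blocky boundary traced along the edges of unit grid cells.
staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2)]

smooth = chaikin(staircase, iterations=1)
print(smooth)
```

Each iteration roughly doubles the number of vertices, so smoother output comes at the cost of larger data sets, and the smoothed line no longer follows the original cell edges exactly.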
