
GIS111 Chapter-5

The document outlines the principles of Geographic Information Systems (GIS), focusing on data sources, collection, and entry. It details the characteristics of geographic data, types of data representation, and various sources including existing GIS data, GPS surveying, and satellite imagery. Additionally, it distinguishes between primary and secondary data sources, emphasizing the importance of accurate data capture for effective GIS applications.

Uploaded by

Dems Weldeyesus
Copyright
© All Rights Reserved

Principles of Geographic Information Systems
GeES 332
GIS Data Sources, Collection, and Entry
Presented by:
Daniel Alemayehu
Chapter Five Outline
• 5.1 Data in GIS
• 5.2 Characteristics of Geographic Data
• 5.3 Representation of data
• 5.4 GIS data sources
5.4.1 Existing GIS Data
5.4.2 Available Map
5.4.3 GPS Surveying
5.4.4 Tabular Data
5.4.5 Digital Orthophoto and Aerial Photographs
5.4.6 Satellite Imagery
• 5.5 Primary and Secondary data sources
5.5.1 Primary data sources
5.5.2 Secondary data sources
5.5.3 Primary Geographic Data Capture
5.5.4 Secondary Geographic Data Capture
• 5.6 Representing Geographic features and
Data Entry
5.6.1 Point Representation
5.6.2 Line Representation
5.6.3 Area Representation
• 5.7 GIS Data Format
Data in GIS
• Geographic data consists of two elements,
namely spatial and descriptive.
• The spatial element gives information about the
feature’s geometric orientation, shape, size, and
position relative to other features.
• The descriptive element records attributes such as
area, length, or population.
• In the GIS domain, the spatial part of a
geographical feature is called spatial data and
the attribute part is known as non-spatial or
attribute data.
• Spatial data are described by x, y coordinates, and
descriptive data are best organized in
alphanumeric fields.
• Normally, spatial and attribute data are stored
separately in a GIS, and links are established
between the two types of data.
• Broadly categorized, the basic data for any GIS
application have two components:
• a. Spatial data: maps that have been prepared
either by field surveys or by the interpretation of
remotely sensed data. Examples: soil survey maps,
geological maps, land use maps, etc., which may be
available in analog or digital format.
• b. Non-spatial data: attributes complementary to
the spatial data that describe what is at a
point, along a line, or in a polygon.
• Examples: soil depth, texture.
• There are four fundamental types of Geographic
data to be input and stored in a GIS:
• Points, Lines, Polygons and Surfaces.
Characteristics of Geographic Data
• Geographic data consists of two elements:
• 1. Spatial: gives information about the feature’s
geometric orientation, shape, size, and position
relative to other features. It is described by x, y
coordinates.
• 2. Descriptive: records attributes such as area,
length, or population. These are best organized in
alphanumeric fields.
• In the GIS domain, the spatial part of a
geographical feature is called spatial data and the
attribute part is known as non-spatial or
attribute data.
• Normally, spatial and attribute data are stored
separately in a GIS, and links are established
between the two types of data.
• Tabular data are attribute data, that is,
characteristics of features. Census records, which
give population, occupation, etc., are one example.
• These tables consist of rows representing
samples and columns representing the values of
data parameters.
• Tabular data are incorporated into a GIS as
relational tables.
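The link between separately stored spatial and attribute data can be sketched as a join on a shared feature identifier. The following is a minimal Python illustration only; the IDs, coordinates, names, and population figures are hypothetical and do not come from any real data set or GIS package.

```python
# Hypothetical feature IDs, coordinates, and attribute values, for illustration only.

# Spatial data: feature ID -> (x, y) coordinates of a point feature.
spatial = {
    101: (36.82, 7.05),
    102: (38.74, 9.03),
}

# Attribute data: one row per feature, keyed by the same feature ID.
attributes = [
    {"id": 101, "name": "Village A", "population": 1200},
    {"id": 102, "name": "Village B", "population": 5400},
]


def link(spatial, attributes):
    """Join each attribute row to its geometry through the shared ID."""
    joined = []
    for row in attributes:
        geometry = spatial.get(row["id"])
        if geometry is not None:  # skip rows with no matching geometry
            joined.append({**row, "x": geometry[0], "y": geometry[1]})
    return joined


features = link(spatial, attributes)
```

Each entry in `features` now carries both the coordinates and the descriptive fields of one feature, which is roughly what a relational join between a geometry table and an attribute table produces.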
Representation of data

• Measuring data involves two things: describing
what the data represent (a naming, legending, or
classification function) and calculating their
quantity (a counting, scaling, or measurement
function).
• Thus the scaling of data is important when
organizing a GIS database.
• There are four scales by which data are
represented.
• A. Nominal, where the data are principally
classified into mutually exclusive sets or levels
based on relevant characteristics.
• The nominal scale is a commonly used measure
for spatial data and can be of two types:
• i. Dichotomous, or presence and absence, data.
This type of data is mainly a logical definition of
data characteristics and is also referred to as
Yes/No data.
• It mainly applies where data are classified into one
of two categories: a village may or may not have
a school facility, a road may or may not have
lanes, and a city may or may not have an airport.
• ii. Categorical data. This measurement scale is
used when data are classified into one of several
categories: different rock types, different
vegetation covers, different administrative units.
• The land use information on a map representing
the different categories of land use is a
categorical representation of data on a nominal
scale.
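The two kinds of nominal data described above can be sketched in Python as a boolean field (dichotomous) and an enumeration (categorical). This is only an illustration; the village records and the land use classes are hypothetical.

```python
from enum import Enum


class LandUse(Enum):
    """Categorical (nominal) classes: labels only, with no order or magnitude."""
    FOREST = "forest"
    CROPLAND = "cropland"
    URBAN = "urban"


# Hypothetical village records: has_school is dichotomous (yes/no) data,
# land_use is categorical data.
villages = [
    {"name": "Village A", "has_school": True, "land_use": LandUse.CROPLAND},
    {"name": "Village B", "has_school": False, "land_use": LandUse.FOREST},
    {"name": "Village C", "has_school": True, "land_use": LandUse.CROPLAND},
]


def count_by_class(records, field):
    """Count how many records fall into each class of a nominal field."""
    counts = {}
    for record in records:
        counts[record[field]] = counts.get(record[field], 0) + 1
    return counts


school_counts = count_by_class(villages, "has_school")
land_use_counts = count_by_class(villages, "land_use")
```

Because nominal classes have no order or magnitude, counting class members is meaningful here, while averaging or ranking the classes is not.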
• B. Ordinal, which is a more sophisticated and
orderly representation, as the classes are placed
into some form of rank order based on a logical
property of magnitude. There are two types of
ranked data. In the first, a set of categories has a
natural ranking associated with it.
• For example, grades of agricultural land or a
groundwater prospect map showing “prospect
levels” are of this type.
• Alternatively, a set of places may be ranked
according to some criterion, say, on the basis of
population density or pollution levels.
• C. Interval, which is a continuous scale of
measurement and is a crude representation of
numeric data on a scale.
• Here, the class definition is a rank order where the
differences between the ranks are quantified.
• The representation of population in rank order is an
example of interval data.
• D. Ratio, which is also a continuous scale, where the
origin of the scale is real and not imaginary.
• Furthermore, ratio intervals represent the scaling
between individual observations in the data set and
not just between data sets.
• An example of the ratio scale is when each value is
normalized against a reference, generally the
average, maximum, or minimum.
GIS data sources
• A tremendous amount of in situ spatial information will
continue to be collected to address important urban
and environmental problems.
• Much of this information is now placed in a GIS.
• A GIS efficiently stores, retrieves, manipulates,
analyzes, and displays these data according to
user-defined specifications.
• The data used in a GIS represent something about
the real world at some point in time.
• The most cost-effective approach to data collection is
to collect only the data you need.
• It is costly to collect, store, and sift through large
quantities of unnecessary data.
• Excess data makes it more difficult to use the data you
really need.
Existing GIS Data
• One method of building a GIS database is to
simply purchase GIS data another entity has
collected.
• Existing digital map data can usually be
purchased for much less than the cost of
creating it.
• This is especially true if the seller has many
buyers.
• The seller can recover the initial investment over
a larger number of sales; therefore, the seller
can charge a lower price on each purchase.
• A typical county or large city may have to spend
hundreds of thousands of dollars to obtain new
digital topographic mapping for a GIS base map.
On the other hand, it may be able to purchase
data containing the same data themes (e.g.,
roads, drainage, topography, buildings, and
vegetation), and covering the same area, for a
few hundred dollars. What is the catch?
• Well, it is a big one. The horizontal and vertical
accuracy of less expensive existing data is likely
to be much lower than that of new topographic
mapping, and the data are likely to show far fewer
details of map features.
Available Map
• Available map data is the most important
source of data for a GIS application.
• Maps of various scales, sizes, formats, and time
periods, representing features such as soil,
geology, cities, contours, elevation, and
cadastres, are major sources for the GIS
database.
• Digitizing is the process of tracing paper maps
into a computer format.
• The term was coined to describe the fact that the
maps are stored as digits in the GIS database.
Most vector GIS data is collected by this
method.
• From the late 1970s to the 1980s, a digitizing
table such as that shown in the following
illustration was used almost exclusively.
Figure: Manual Table Digitizing
• When using a digitizing table, the paper map is
carefully taped down on the table’s surface.
• A grid of fine wires is embedded in this surface.
This grid senses the position of the crosshair on
a hand-held cursor. When the cursor button is
depressed, the system records a point at that
location in the GIS database.
• The operator also identifies the type of feature
being digitized, or its attributes. In this way the
map features can be traced into the system.
• A process more commonly used today is to first
digitize the entire paper map using a scanner,
such as that shown in the following illustration,
capturing it as a raster image.
GPS Surveying
• Surveys are carried out to record the status of a
resource, for example a geological survey for soils
or a topographic survey for elevation and other
features.
• The data are represented in map form or in
coordinate geometry form.
• One alternative to traditional manual digitizing to
be explored uses Global Positioning System
(GPS) survey techniques. GPS satellites are
equipped with atomic clocks, computers, and
transmitters, each satellite broadcasting 24
hours a day.
• Reading the signal from at least four satellites,
GPS receivers on earth are able to determine
both their elevation and their location on the
earth’s surface.
• During the 1980s, mapping companies began
using GPS survey techniques to set the ground
control for aerial photography and
photogrammetric mapping projects. The cost
was much lower and the results more reliable.
• This process is used to locate utility lines,
wetlands boundaries, park improvements, and
so forth. It can also be used to verify and edit
GIS data in the field, greatly reducing the time
and cost required to “ground truth” data derived
from satellite imagery and aerial photographs.
Tabular Data

• Tabular data are attribute data. Census records,
which give population, occupation, etc., are one
example.
• These tables consist of rows representing
samples and columns representing the values of
data parameters.
• Tabular data are incorporated into a GIS as
relational tables.
Digital Orthophoto and Aerial Photographs
• An orthophoto is a rectified aerial photographic
image; that is, the relief displacement and radial
distortions, which are both inherent in aerial
photos, have been eliminated.
• The orthophoto is geometrically equivalent to a
conventional line map, which represents
planimetric features on the ground in their true
orthographic positions.
• Because of this, orthophotos possess the
advantages of line maps, such as the ability to
make measurements of distances, angles, and
areas.
• However, orthophotos, unlike line maps, also
contain the images of an infinite number of
ground objects.
• In the past, the idea of creating digital
orthophotos was out of the question, largely due
to the difficulty of storing large amounts of data,
and lack of technology that could provide
enough power to produce the end product
quickly and at a reasonable cost.
• Today, computer power and storage have
reached a level in speed and cost that allow
digital orthophotography to be a commercial
reality.
• Digital orthophotos provide all the information of
a photograph, but at the same time allow the
registration of vector maps used in GIS.
Satellite Imagery
• Satellite images are classic sources of data on the
natural resources of a region. They provide a record
of the continuum of resource status because of the
availability of repetitive coverage.
• They can be used to study and monitor land
features, natural resources, and dynamic
aspects of human activities, and to prepare
thematic maps depicting the status of various
resources.
• Satellite imagery can also be used as a raster
backdrop to vector GIS data. Satellite images
have supported numerous GIS applications,
including environmental impact analysis, site
evaluation for large facilities, highway planning,
the development and monitoring of
environmental baselines, emergency and
disaster response, agriculture, and forestry.
• Satellite images are especially useful for urban
planning and management, where they are used
to detect areas of change, monitor traffic
conditions, measure water levels in reservoirs
and building heights, and many other
applications. They are also particularly helpful
for agriculture and forestry, providing information
for crop and forest identification and inventory,
growth and health monitoring, and even
measuring tree heights.
• In addition to image analysis, satellite images
can also be used to create “image maps”, that is,
maps that combine the raster satellite image
with vector line work and text that delineate
special features, such as boundaries, roads, and
transmission lines.
• There are four important aspects of satellite
imagery: spatial resolution, spectral resolution,
temporal resolution, and extent.
Primary and Secondary data sources
• Based on their source, geographic data are
classified into two kinds:
• 1. Primary data and
• 2. Secondary data.
Primary data sources
• Primary data are direct measurements. Primary
data collection is necessary when a researcher
cannot find the data needed in secondary
sources.
• Three basic means of obtaining primary data are
observation, surveys, and experiments. The
choice will be influenced by:
• 1. The nature of the problem.
• 2. The availability of time and money
• Some of the sources of primary data are:
• 1. Ground Surveys
• 2. GPS (Global Positioning System)
• 3. Aerial Photographs
• 4. Satellite Images

Secondary data sources
• Secondary data are data that have been collected by
individuals or agencies for purposes other than
those of our particular research study.
• They also include data derived by processing primary
data or other secondary data.
• Example: If a government department has
conducted a survey of family food expenditures,
then a food manufacturer might use these data in
its evaluation of the total potential
market for a new product.
• Some of the sources of secondary data are:
• 1. Digitized Paper Maps
• 2. Interpolated Surfaces
• 3. Scanned and Processed Images
• 4. Available map data
Primary Geographic Data Capture
• Raster data capture
• Some of the ways of capturing primary raster
data are the following:
• Remote sensing
• Remote sensing is a technique used to derive
information about the physical, chemical, and
biological properties of objects without direct
physical contact.
• Information is derived from measurements of the
amount of electromagnetic radiation reflected,
emitted, or scattered from objects.
• Aerial photographs
• Aerial photography is equally important in
medium- to large-scale projects. Photographs
are normally collected by analogue optical
cameras and later scanned, so the data are
captured in raster format.
• Vector data capture
• The two main branches of vector data capture are
ground surveying and GPS. Ground surveying is
based on the principle
that the 3-D location of any point can be
determined by measuring angles and distances
from other known points. Traditional equipment
like transits and theodolites has been replaced
by total stations that can measure both angles
and distances to an accuracy of 1 mm.
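The ground surveying principle above, locating a point from angles and a distance measured at a known point, can be sketched as a short trigonometric calculation. This is a simplified illustration rather than a full survey reduction: the function name, the station coordinates, and the observation are hypothetical, and instrument height, target height, and earth curvature are ignored.

```python
import math


def locate_point(station, azimuth_deg, zenith_deg, slope_distance):
    """Compute (easting, northing, height) of a target from a known station.

    azimuth_deg is measured clockwise from north; zenith_deg is measured
    down from the vertical; slope_distance is the straight-line distance.
    Instrument and target heights are ignored in this sketch.
    """
    east0, north0, height0 = station
    azimuth = math.radians(azimuth_deg)
    zenith = math.radians(zenith_deg)

    horizontal = slope_distance * math.sin(zenith)  # horizontal distance
    rise = slope_distance * math.cos(zenith)        # height difference

    east = east0 + horizontal * math.sin(azimuth)
    north = north0 + horizontal * math.cos(azimuth)
    return east, north, height0 + rise


# Hypothetical station and observation: due east, level sight, 100 m away.
target = locate_point((500.0, 1000.0, 50.0), 90.0, 90.0, 100.0)
```

With a level sight due east, the target lies 100 m east of the station at the same northing and height, up to floating-point rounding.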
Secondary Geographic Data Capture
• Raster data capture using scanners
• Three main reasons to scan hardcopy media
are:
• Documents are scanned to reduce wear and
tear, improve access, provide integrated
database storage, and to index them
geographically.
• Film and paper maps, aerial photographs, and
images are scanned and georeferenced so that
they provide geographic context for other data.
• Maps, aerial photographs, and images are
scanned prior to vectorization.
• Vector data capture
• 1. Manual digitizing
• Digitizers operate on the principle that it is possible
to detect the location of a cursor or puck passed
over a table inlaid with a fine mesh of wires.
• Vertices defining point, line, and polygon objects
are captured using manual or stream digitizing
methods.
• 2. Heads-up digitizing and vectorization
• Vectorization is the process of converting raster
data into vector data.
• The simplest way to create vectors from raster
layers is to digitize vector objects manually straight
off a computer screen using a mouse or digitizing
cursor.
Representing Geographic features and
Data Entry
• Vector data represent geographical features by
a set of coordinates. Vectors, as x, y coordinates,
define points, lines, and polygons.
• The basic premise of vector-based
structuring is to define a two-dimensional space
where features are represented by coordinates
on the two axes.
• In vector representations, an attempt is made to
explicitly associate georeferences with the
geographic phenomena.
• A georeference is a coordinate pair from
geographic space, and is also known as a vector.
This explains the name.
• GIS features can be classified into four
categories, three of which pertain to spatial data
and the fourth to attribute data.
• a. Points: points are features that have a specific
location but no extent in any direction and are
represented by a pair of coordinates. Village
locations, cities, and so on are examples of point
data. On maps they are represented by specific
symbols.
• b. Lines: line features represent linear features
and consist of a series of x, y coordinate pairs
with discrete beginning and ending points. Line
features have length attributes. Rivers, streams,
road networks, and so on are examples of line
data.
• c. Polygons: polygons are closed features defined
by a set of linked lines enclosing an area.
Polygons are characterized by area and
perimeter. Administrative boundaries, city
boundaries, etc. are polygon features.
• d. Attributes: these are either the qualitative
characteristics of the spatial data or descriptive
information about the geographical features.
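The four categories above can be sketched as a small Python data structure that pairs a geometry with its attributes. The `Feature` class, the helper functions, and all names and coordinates here are hypothetical, and the area calculation uses the standard shoelace formula for simple polygons; this is an illustration, not the storage format of any particular GIS.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A vector feature: a geometry type, its coordinates, and its attributes."""
    kind: str                      # "point", "line", or "polygon"
    coords: list                   # list of (x, y) pairs
    attributes: dict = field(default_factory=dict)


def polygon_area(coords):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(coords, coords[1:] + coords[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(coords):
    """Sum of edge lengths, including the closing edge back to the first vertex."""
    return sum(
        math.dist(p, q) for p, q in zip(coords, coords[1:] + coords[:1])
    )


# Hypothetical features in arbitrary map units.
village = Feature("point", [(2.0, 3.0)], {"name": "Village A"})
river = Feature("line", [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)], {"name": "River B"})
district = Feature(
    "polygon", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)], {"name": "District C"}
)

area = polygon_area(district.coords)            # 4 x 3 rectangle
perimeter = polygon_perimeter(district.coords)
```

The point needs one coordinate pair, the line a series of pairs, and the polygon a ring of pairs from which area and perimeter follow; the `attributes` dictionary holds the descriptive part.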
Point Representation
• Points are defined as single coordinate pairs (x, y)
when we work in 2D or coordinate triples (x,y,z)
when we work in 3D.
• Points are used to represent objects that are best
described as shapeless, sizeless, single-locality
features.
• Whether this is the case really depends on the
purpose of the spatial application and also on the
spatial extent of the objects compared to the scale
applied in the application.
• For a tourist city map, parks will not usually be
considered point features, but perhaps
museums will be, and certainly public phone booths
could be represented as point features.
• Besides the georeference, extra data are usually
stored for each point object. This so-called
administrative or thematic data can capture
anything that is considered relevant about the
object. For phone booth objects, this may include
the owning telephone company and the phone
number.
Line Representation
• Line data are used to represent one-dimensional
objects such as roads, railroads, canals, rivers,
and power lines. Again, there is an issue of
relevance for the application and the scale that
the application requires.
• For example, in an application for mapping tourist
information, bus, subway, and streetcar routes
are likely to be relevant line features. Some
cadastral systems, on the other hand, may
consider roads to be two-dimensional features,
i.e., having a width as well.
• The two end nodes and zero or more internal
nodes define a line.
• Another word for internal node is vertex;
other terms for line that are used in some
GISs are polyline, arc, and edge. A node or vertex is
like a point, but it only serves to define the line; it
has no special meaning to the application other
than that.
• The vertices of a line help to shape it and to
obtain a better approximation of the actual
feature. The straight parts of a line between two
consecutive vertices or end nodes are called line
segments.
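The terms above (end nodes, internal vertices, line segments) can be illustrated with a short Python sketch. The road coordinates are hypothetical and in arbitrary map units.

```python
import math

# Hypothetical road centerline in arbitrary map units.
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0), (6.0, 10.0)]

end_nodes = (road[0], road[-1])      # the two end nodes
internal_vertices = road[1:-1]       # zero or more internal nodes (vertices)

# Each pair of consecutive points bounds one straight line segment.
segments = list(zip(road, road[1:]))

# The line's length attribute is the sum of its segment lengths.
length = sum(math.dist(start, end) for start, end in segments)
```

Adding more internal vertices along a curved feature shortens each segment and brings the stored line closer to the actual shape.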
Area Representation

• When area objects are stored using a vector
approach, the usual technique is to apply a
boundary model.
• This means that each area feature is
represented by some arc/node structure that
determines a polygon as the area’s boundary.
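One way to picture the boundary model is a table of arcs, each stored once, and a table of polygons that refer to those arcs. The Python sketch below is a simplified assumption of such an arc/node structure, not the format of any specific GIS; the arc IDs, parcel names, and coordinates are hypothetical.

```python
# Hypothetical arc/node structure: two unit-square parcels sharing one boundary arc.
# Each arc is stored once as an ordered list of (x, y) points from start node to end node.
arcs = {
    "a1": [(1.0, 0.0), (0.0, 0.0), (0.0, 1.0), (1.0, 1.0)],  # outer boundary of left parcel
    "a2": [(1.0, 1.0), (1.0, 0.0)],                          # boundary shared by both parcels
    "a3": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],  # outer boundary of right parcel
}

# Each polygon lists, in order, the arcs that form its boundary.
polygons = {
    "left": ["a1", "a2"],
    "right": ["a3", "a2"],
}


def boundary_ring(polygon_id):
    """Assemble a polygon's boundary by chaining its arcs end to end."""
    ring = []
    for arc_id in polygons[polygon_id]:
        points = arcs[arc_id]
        # Skip the first point of an arc when it repeats the previous arc's end node.
        if ring and ring[-1] == points[0]:
            points = points[1:]
        ring.extend(points)
    return ring


left_ring = boundary_ring("left")
right_ring = boundary_ring("right")
```

Storing the shared arc `a2` only once means the common boundary of the two parcels cannot drift apart when one of them is edited, which is a common motivation for arc/node structures.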
GIS Data Format
• In general, the two most widely used types of
GIS data structure are vector format and raster
format.
• The vector and raster models for storing
geographic data each have unique
advantages and disadvantages, and a
full-function GIS can handle both models.
• The following illustration shows the various ways
these two data models would represent the
same map features.
• Vector Data
• Vector digital map data is recorded as distinct
points, lines (a series of point coordinates), or
areas (Shapes bounded by lines).
• In the vector model, information about points,
lines, and polygons is encoded and stored as a
collection of x, y coordinates.
• Raster Data
• Raster data files consist of rows of uniform cells
coded according to data values.
• An example would be land cover classification.
The computer can manipulate raster data files
quickly, but they are generally less detailed than
vector data.
• The degree of approximation is related to the
size of the cells.
• Halving the grid spacing will quadruple the
number of cells to manipulate. Maps plotted from
raster data may be less visually appealing than
vector data files, which have the appearance of
more traditional hand-drafted maps.

• For these reasons, the raster data model is
generally used to model continuous map
features. Like vector data, raster data can have
attribute data attached to individual cells. This
data can include map feature attributes such as
types, measurements, names, values, dates,
and classes.
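The remark above that halving the grid spacing quadruples the number of cells can be checked with a few lines of arithmetic. In this Python sketch the `cell_count` helper and the study area dimensions are hypothetical.

```python
import math


def cell_count(width, height, cell_size):
    """Number of square cells needed to cover a width x height extent."""
    columns = math.ceil(width / cell_size)
    rows = math.ceil(height / cell_size)
    return rows * columns


# Hypothetical 10 km x 8 km study area, in metres.
coarse = cell_count(10_000, 8_000, 100)   # 100 m cells
fine = cell_count(10_000, 8_000, 50)      # 50 m cells (spacing halved)
```

Both the number of rows and the number of columns double when the spacing is halved, so the total cell count grows by a factor of four.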
• Raster Images
• A different type of raster data should also be
mentioned. Photographs, drawings, paper maps,
and other documents can be scanned into a raster
digital format and attached to spatial data elements
as attributes.
• Scanned raster images or related documents
could likewise be linked to the tax parcels. These
might include deeds, licenses, inspection reports,
permits, and photographs of lot improvements,
such as buildings. Click on the display of the tax
map, and the GIS retrieves and displays the
attached raster image.
• The cells of these raster images are usually
referred to as pixels (short for “picture
elements”) and are generally defined in one of
three ways.

• Black-and-white drawings are typically scanned
into a binary format in which each pixel is coded
as either black or white.
• Continuous tone photographs and maps are
typically scanned and stored as color pixels.
• Scanners are available for each type of task.
• Binary raster files are relatively small in
comparison to grayscale and color raster files.
• Each grayscale or color pixel requires a much
longer data record to describe it than a pixel
defined simply as black or white.
• Moreover, binary raster data can be greatly
compressed. Data compression is a process in
which contiguous cells of the same color value
are coded together as a group.
• Obviously, the binary raster file of a scanned
black-line engineering drawing has many white
pixels that can be compressed in this manner,
greatly reducing its size.
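Coding runs of identical cells together as a group is commonly known as run-length encoding. The Python sketch below shows the idea on a single hypothetical scan line; real raster formats use more elaborate schemes, so treat this as an illustration of the principle only.

```python
def run_length_encode(row):
    """Encode a row of pixel values as [value, run_length] pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return runs


def run_length_decode(runs):
    """Rebuild the original row from its runs."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row


# Hypothetical scan line: 0 = white, 1 = black.
scan_line = [0] * 12 + [1] * 3 + [0] * 9
encoded = run_length_encode(scan_line)
```

Here 24 pixels are described by three runs, and decoding the runs restores the scan line exactly. The longer the runs of white pixels, the greater the saving.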
