
Chapter Three: 3. Data Management and Processing Systems

Chapter Three discusses data management and processing systems, focusing on databases and database management systems (DBMS) that facilitate the storage, manipulation, and querying of geographic data. It covers the advantages and disadvantages of DBMS, types of geodatabases, and the steps involved in spatial database design. The chapter emphasizes the importance of properly modeling reality through databases to manage and analyze spatial information effectively.

Uploaded by

bonsadefara

Chapter Three

3. Data management and processing systems
INTRODUCTION
Database
➢ We create "models" of reality that are intended to have some similarity with
selected aspects of the real world.

➢Databases are created from these "models" as a fundamental step in coming to know the nature and status of that reality. A database can be thought of as an integrated set of data on a particular subject.

➢A geographic (spatial) database is simply a database containing geographic data for a particular area and subject.
Database Contd…
➢A database is essential for storing large quantities of data, supporting multiple users at the same time, maintaining data integrity, recovering from system crashes, and providing an easy-to-use data manipulation language.
➢ Databases can be physically stored in files or in specialist software programs
called database management systems (DBMS).
➢A DBMS is a general-purpose software system that facilitates the process of
defining, constructing and manipulating databases for various applications.
➢It is a collection of programs that enables users to create and maintain a
database.
➢The core of a GIS is therefore a
DBMS handling the storage and
management of the data, as well as
interaction with users.
➢Before using such a system, the
objects of interest have to be
modelled, and appropriate structures
have to be found to store all the data.
➢Today, most large organizations
use a combination of files and
DBMS for storing data assets.
Data Base Management System
A DBMS in GIS facilitates the process of:
➢Defining a spatial database; that is, specifying spatial and non-spatial
data types, structures, and constraints, such as coordinate systems, topology
rules, and spatial relationships.
➢Constructing the database; storing geographic data, including vector
layers, raster imagery, and attribute information, in persistent storage.
➢Manipulating the database: controlling access, managing layers, and
ensuring data integrity.
➢Querying the database: retrieving specific geographic features based on
location, attributes, or spatial relationships.
➢Updating the database: modifying attribute values, updating geometries,
and maintaining real-time geographic information.
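The defining, constructing, querying, and updating tasks listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration: SQLite stands in for a full spatial DBMS, the `city` table, its columns, and the sample rows are hypothetical, and a simple bounding-box test replaces true spatial operators.

```python
import sqlite3

# In-memory database; SQLite stands in for a full spatial DBMS (assumption).
conn = sqlite3.connect(":memory:")

# Defining: attribute types plus simple constraints on the coordinates.
conn.execute(
    """
    CREATE TABLE city (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        population INTEGER,
        lon REAL NOT NULL CHECK (lon BETWEEN -180 AND 180),
        lat REAL NOT NULL CHECK (lat BETWEEN -90 AND 90)
    )
    """
)

# Constructing: store features (hypothetical values, for illustration only).
conn.executemany(
    "INSERT INTO city (name, population, lon, lat) VALUES (?, ?, ?, ?)",
    [
        ("Alpha", 120000, 38.7, 9.0),
        ("Beta", 45000, 39.5, 8.5),
        ("Gamma", 80000, 42.0, 11.6),
    ],
)


def cities_in_box(min_lon, min_lat, max_lon, max_lat, min_population=0):
    """Querying: select by location (bounding box) and by attribute."""
    rows = conn.execute(
        """
        SELECT name FROM city
        WHERE lon BETWEEN ? AND ? AND lat BETWEEN ? AND ?
          AND population >= ?
        ORDER BY name
        """,
        (min_lon, max_lon, min_lat, max_lat, min_population),
    )
    return [name for (name,) in rows]


# Updating: modify an attribute value.
conn.execute("UPDATE city SET population = ? WHERE name = ?", (50000, "Beta"))
conn.commit()

print(cities_in_box(38.0, 8.0, 40.0, 10.0))
print(cities_in_box(38.0, 8.0, 40.0, 10.0, min_population=100000))
```

The CHECK constraints play the role of the "constraints" mentioned under defining the database: a row with a latitude outside -90 to 90 would be rejected at insert time.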
Advantages and Disadvantages of a DBMS
Advantages
• A DBMS allows organizations to define and enforce security and standards for data and data access.
• DBMS are better suited to managing large numbers of users simultaneously working with vast amounts of data.
• A DBMS can be instructed to guard over some levels of data correctness.
Disadvantages
• The cost of acquiring DBMS software can be quite high.
• A DBMS adds complexity to the problem of managing data, especially in small projects.
• Regular maintenance and administration are required.
Spatial Database/Geodatabase
• It is a relational database that contains spatial and non-spatial objects.

• It furnishes the data organizational structure and workflow process model for the creation and maintenance of the core data product.

• It is the heart of a GIS’s management capability.

• A geodatabase is the top-level unit of geographic data. It is a collection of datasets, feature classes, object classes, and relationship classes.
Spatial Database/Geodatabase
• Spatial data can be stored in a special database column, referred to as the
“geometry” or “feature” or “shape data type”, depending on the specific
software package.
• This means GISs can rely fully on a DBMS support for spatial data, making
use of a DBMS for data query and storage (and multi-user support), and a
GIS for spatial functionality.
• Spatial databases, also known as geo-databases, are implemented directly on
existing DBMSs using extension software to allow them to handle spatial
objects.
• A GIS database is composed of all of the spatial information (maps,
imagery) and associated attribute information (tables, reports).
Spatial Database/Geodatabase
➢Data in a geodatabase is usually organized into broad categories such as
land base, transportation, environment, and utility infrastructure.
• Feature datasets: define a scope for a particular spatial reference. All
feature classes that participate in topological relationships with one another
(for example, in a geometric network or a topology) must have the same spatial
reference.
• Topologies: many vector datasets have features that could share boundaries
or corners. Topologies set up rules defining how features share their geometry.
• Geometric networks: some vector datasets, particularly those used to
model communications, material or energy flow, or transportation networks,
need to support connectivity tracing and network connectivity rules.
Spatial Database/Geodatabase
• Relationship classes: define relationships between objects in the
geodatabase. It can be simple one-to-one relationships, such as you might
create between a feature and a row in a table, or more complex one-to-many
(or many-to-many) relationships between features and table rows.
• Object classes: an object class is a table in a geodatabase with which you can associate
behavior. Object classes keep descriptive information about objects that are
related to geographic features, but are not features on a map.
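A one-to-many relationship between a feature and rows in a non-spatial table can be sketched as follows. This is a minimal illustration using sqlite3, not the geodatabase implementation itself; the `parcel` and `inspection` tables, the WKT strings, and the notes are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the relationship rule

# Feature class stand-in: each parcel has a geometry (stored as WKT text).
conn.execute(
    "CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY, wkt TEXT NOT NULL)"
)

# Object class stand-in: descriptive rows that are not map features.
conn.execute(
    """
    CREATE TABLE inspection (
        inspection_id INTEGER PRIMARY KEY,
        parcel_id INTEGER NOT NULL REFERENCES parcel(parcel_id),
        note TEXT NOT NULL
    )
    """
)

conn.execute(
    "INSERT INTO parcel VALUES (1, 'POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))')"
)
conn.execute(
    "INSERT INTO parcel VALUES (2, 'POLYGON((10 0, 20 0, 20 10, 10 10, 10 0))')"
)
conn.executemany(
    "INSERT INTO inspection (parcel_id, note) VALUES (?, ?)",
    [(1, "fence repaired"), (1, "boundary resurveyed"), (2, "no issues")],
)


def notes_for_parcel(parcel_id):
    """Follow the one-to-many relationship from one feature to its rows."""
    rows = conn.execute(
        "SELECT note FROM inspection WHERE parcel_id = ? ORDER BY inspection_id",
        (parcel_id,),
    )
    return [note for (note,) in rows]


print(notes_for_parcel(1))
```

Here one parcel feature relates to several inspection rows, while each inspection row points back to exactly one parcel.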
TYPES OF GEODATABASES
Personal
• Functionality: original desktop format
• Storage mechanism: Microsoft Access database (.mdb)
• Storage limit: 2 GB per geodatabase; effective limit ~500 MB
• User limit: one editor per database
• Platform: Windows
• Licensing: ArcGIS for Desktop - Any
File
• Functionality: improved desktop format
• Storage mechanism: file folder; displays .gdb extension in ArcCatalog
• Storage limit: 1 TB per object; configurable to 256 TB
• User limit: one editor per object
• Platform: Any
• Licensing: ArcGIS for Desktop - Any
Desktop
• Functionality: distributed data or project-level use
• Storage mechanism: Microsoft SQL Server Express
• Storage limit: 10 GB per database server
• User limit: three concurrent users, one of which can edit
• Platform: Windows
• Licensing: ArcGIS for Desktop - Standard, Advanced
Workgroup
• Functionality: departmental projects or small organizations
• Storage mechanism: Microsoft SQL Server Express
• Storage limit: 10 GB per database server
• User limit: ten concurrent users, all can edit
• Platform: Windows
• Licensing: ArcGIS for Server - Workgroup
Enterprise
• Functionality: large capacity and user base
• Storage mechanism: SQL Server, Oracle, PostgreSQL, DB2, Informix
• Storage limit: limited by relational database and hardware
• User limit: unlimited
• Platform: Any
• Licensing: ArcGIS for Server - Enterprise
➢ Personal Geodatabases (single-user geodatabases):
✓ Used in ArcGIS since their initial release.
✓ Use the Microsoft Access data file structure (the .mdb file) and Jet Engine.
✓ Can be viewed by multiple users but edited by only one user at a time.
✓ Have a maximum size of 2 Gigabytes (GB) or less.
✓ Do not store raster data.
✓ Only supported on the Microsoft Windows operating system.
➢ Multi-user Geodatabases (ArcSDE geodatabases):
✓ Stored in a relational database using Oracle, Microsoft SQL Server, IBM DB2, IBM Informix, or PostgreSQL.
✓ Require ArcSDE and a DBMS (Database Management System).
✓ Can be read and edited by multiple users at the same time.
✓ Can store raster data.
✓ Better storage capacity than single-user geodatabases.
• File geodatabase
• It is a relational database storage format. It is a far more complex data structure than the
shapefile and consists of a .gdb folder housing dozens of files.
• It stores multiple feature classes and enables topological definitions (i.e. allowing the
user to define rules that govern the way different feature classes relate to one another).
• It provides a portable geodatabase that works across operating systems.
• It scales up to handle very large datasets.
• Datasets can scale beyond 500 GB per file with very fast performance.
• It allows users to compress vector data to a read-only format to reduce storage
requirements even further.
FILE GEODATABASE
TERMINOLOGY IN SPATIAL DATABASES
1. Geographic Objects:
➢A geographic object corresponds to an entity of
the real world and has two components.
✓ Description. The object is described by a set of
descriptive attributes. These are also referred to as
alphanumeric attributes (containing both letters and numbers).
Ex: The name and population of a city constitute
its description.
✓ Spatial component, which may embody both
geometry (location in the underlying geographic
space, shape, and so on) and topology (spatial
relationships existing among objects, such as
adjacency and connectivity).
2. Theme
➢A theme is a collection of geographic objects
that have a set of homogeneous characteristics
(i.e., objects having the same structure or type).

➢In a GIS, the geospatial information
corresponding to a particular topic is gathered in a
theme (e.g., LULC, Transportation, Hydrology).

3. Maps
➢When a theme is displayed on paper or on-screen,
what the user sees is a map as it is commonly
displayed, with colours, a particular scale, a legend,
and so on.

➢Topographic maps, railway maps, and weather
maps are examples of maps commonly used.
Element/Components of Spatial Database
Elements of reality modeled in a GIS database have three identities:
✓ Entity: The element in reality
✓ Object: The element as it is represented in the database
✓ Symbol: The graphical representation of the object/entity on a map or other visual display

Example:
✓ Entity: a road in the real world.
✓ Object: the road as a line feature in a GIS database, with attributes like name and road type.
✓ Symbol: a black or red line on a map that represents the road, possibly with labels.
Element/Components of Spatial Database
1. Entity
An entity is "a phenomenon of interest in reality that is not further subdivided into
phenomena of the same kind"
•e.g. a city could be considered an entity and subdivided into component parts but these
parts would not be called cities, they would be districts, neighborhoods or the like
2. Spatial Object:
An object is "a digital representation of all or part of an entity" .
•The method of digital representation of a phenomenon varies according to scale, purpose
and other factors.
•e.g. a city could be represented geographically as a point if the area under consideration
were continental in scale. The same city could be represented as an area if
we are dealing with a geographic database for a state or a county.
TYPES OF DBMS
DBMS can be classified according to the way they store and manipulate
data.
Three main types of DBMS are available to GIS users today:
1. Relational (RDBMS)
2. Object (ODBMS)
3. Object-relational (ORDBMS)
1. A relational database comprises a set of tables, each a two-
dimensional list (or array) of records containing attributes about the
objects under study.
2. Object database management systems (ODBMS) were initially
designed to address several of the weaknesses of RDBMS. These
include the inability to store complete objects directly in the
database.
3. Object-relational DBMS (ORDBMS): Pure ODBMS have seen limited
adoption, largely because of the massive installed base of RDBMS and
the fact that RDBMS vendors have now added many of the important
ODBMS capabilities to their standard RDBMS software systems to
create hybrid object-relational DBMS (ORDBMS).
Spatial Database Design
➢Database design involves
identifying important phenomena and
choosing the appropriate data
representation for storage.
➢The overall goal of database design
is to maintain data consistency and
integrity, reduce data redundancy,
increase system performance, maintain
maximum user flexibility, and create
a usable system.
Spatial Database Design
However, database design is highly influenced by:
✓ Applications,
✓Data format and size,
✓Data maintenance and update,
✓Hardware/software,
✓Number and sophistication of users,
✓Schedule and budget of the project, management approach
Steps in Spatial database design
➢Database design involves three major steps: building the conceptual, logical, and physical models.
1. Conceptual Model
➢ Model the user’s view (step 1: User View).
This involves tasks such as identifying organizational functions.
➢ Define objects and their relationships (step 2: Objects and Relationship).
The object types (classes) and functions can be specified.
➢ Select geographic representation (step 3: Geographic Representation).
This means choosing the types of geographic representation (discrete objects such as point, line, and polygon, or fields).
2. Logical Model
➢ Match to geographic database types (step 4: Geographic Database Types).
This involves matching the object types to be studied to specific data types supported by the GIS that will be used to create and maintain the database.
➢ Organize geographic database structure (step 5: Geographic Database Structure).
This includes tasks such as defining topological associations, specifying rules and relationships, and assigning coordinate systems.
3. Physical Model
➢ Define database schema (step 6: Database Schema). The final stage is definition of the actual physical database schema that will hold the database data values.
➢ A schema is a compact graphical representation of the conceptual model, the entities and the relationships among them.
➢ The relationships may be one-to-one, between one entity and another, or they may be one-to-many or many-to-many, connecting several objects.
➢ These relationships are represented by lines connecting the entities, and may indicate whether the relationships are between one or many entities.
STEPS IN SPATIAL DATABASE DESIGN

Conceptual Model: 1. User View, 2. Objects and Relationship, 3. Geographic Representation
Logical Model: 4. Geographic Database Types, 5. Geographic Database Structure
Physical Model: 6. Database Schema
Chapter Four

4. Spatial Data Referencing


Shape of the Earth
➢For many mapping applications, the earth can be assumed to
be a perfect sphere.

➢In reality, the earth more closely resembles a figure called an ellipsoid or spheroid than a sphere.

➢The shape and size of a geographic coordinate system’s surface is defined by a sphere or spheroid.

➢The earth is sometimes treated as a sphere to make mathematical calculations easier.

➢We must choose a geometric model (known as a geodetic datum) that closely approximates the shape of the Earth, yet can be described in simple mathematical terms, so that locations can be transferred from the idealized Earth model to the chosen planar coordinate system.

[Figure: a perfect sphere compared with an ellipsoid or spheroid]
Geoid
❑Geodesy is the branch of science concerned with measuring the size
and shape of the earth, the position of points on the earth's surface,
and the dimensions of areas so large that the curvature of the earth
must be taken into account.
❑The geoid is commonly referred to as the Mean Sea Level
(MSL) surface and serves as a vertical datum.
❑The geoid is an equipotential surface: a surface on which the gravity
potential (the combined effect of gravitational attraction and the
centrifugal force of the earth's rotation) is constant.
❑On the geoid it is the gravity potential that is constant; the strength
of gravity itself still varies from place to place.
❑The size and shape of the best-fitting ellipsoid, as well as its
location relative to the centre of mass of the Earth, differ
from place to place.
❑Geoid undulation (geoid height) is the elevation difference between a
standard shape of the earth (ellipsoid) and a surface of
constant gravitational potential (geoid).
Geoid contd...
Ellipsoidal vs Orthometric height
➢ Ellipsoidal height refers to elevation values above or below an idealized
surface which approximates the shape of the earth as a spheroid.
• An example of an ellipsoid is WGS 84.
• Ellipsoidal height can vary greatly from the local sea level.
➢ Orthometric height refers to elevation values above or below a geoid model
surface; the geoid approximates local sea level.
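The two height types are linked by the separation between the geoid and the ellipsoid (the geoid height, often written N). The sketch below uses the standard approximation h = H + N, which is not stated in the slides; the function names and the sample values are hypothetical.

```python
def orthometric_height(ellipsoidal_height_m, geoid_height_m):
    """Approximate orthometric height H from ellipsoidal height h.

    Uses the standard relation H = h - N, where N is the geoid height
    (geoid above the ellipsoid is positive). All values are in metres.
    """
    return ellipsoidal_height_m - geoid_height_m


def ellipsoidal_height(orthometric_height_m, geoid_height_m):
    """Inverse relation: h = H + N."""
    return orthometric_height_m + geoid_height_m


# Hypothetical GPS reading: h = 2355.0 m where the geoid lies 5.0 m
# below the ellipsoid (N = -5.0 m).
print(orthometric_height(2355.0, -5.0))
```

In practice N comes from a geoid model for the area of interest, so the quality of the converted height depends on that model.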
Ellipsoid /spheroid Fitting
•Ellipsoid must be fitted to the geoid
•Ideally, the best fit would be one where the geoid and
ellipsoid coincide exactly.
•Not possible over the entire geoid so regional
fitting is often used.
Datum
•Is a set of parameters defining a coordinate system, and a set of control points
whose geometric relationships are known, either through measurement or
calculation
•Defined by a spheroid, which approximates the shape of the Earth, and the
spheroid’s position relative to the center of the Earth
•Provides a frame of reference for measuring locations on the surface of the
earth.
•Defines the origin and orientation of latitude and longitude lines.
Datum
➢ Geocentric datum
•An earth-centered, or geocentric, datum uses the earth's
center of mass as the origin
➢ Local datum
•A local datum aligns its spheroid to closely fit the
earth's surface in a particular area.
•A point on the surface of the spheroid is matched to a
particular position on the surface of the earth. This
point is known as the origin point of the datum.
•The coordinate system origin of a local datum is not at
the center of the earth (it is offset from the earth's center).
•NAD 1927 and the European Datum of 1950 (ED 1950)
are local datums.
Spatial Reference Line

• Equator
• Prime Meridian
• Parallels
• Meridians
PARALLELS(A) and MERIDIANS(B)
• Parallels run parallel to each other in an east-west direction around the Earth.

• The meridians are geographic north-south lines that converge at the poles.

• The Prime Meridian, which passes through Greenwich, England, and close to Accra, Ghana, has a longitude of 0°.

• Using the prime meridian as a reference, we can measure the longitude value of a point on the Earth's surface as 0° to 180° east or west of the prime meridian.

• Similarly, using the Equator as 0° latitude, we can measure the latitude value of a point as 0° to 90° north or south of the equator.
How spatial data stores location
• A coordinate is a set of numbers that designates location in a given reference system, such as x, y in a plane coordinate system or x, y, z in a three-dimensional coordinate system.
• Coordinate pairs represent location on the earth's surface relative to other locations.
[Figure: coordinate grid with origin at 0,0; x runs from -180 to +180 and y from -90 to +90. The signs of (x, y) in the four quadrants are (-,+), (+,+), (-,-), and (+,-). X axis = Equator; Y axis = Prime Meridian]
Why referencing and projection
❖Reference systems and projections serve
the following purposes:
✓Creating spatial data (collecting GPS
data)
✓Import into GIS and overlay with
other layers
✓Acquiring spatial data from other
sources
✓Display your GPS data using maps
✓Describes where features are located
in the real world
Geographic Coordinate Systems(GCS)
•Geographic coordinate systems consist of angles of latitude, which vary from north to south,
and longitude, which vary from east to west. A point is referenced by its longitude and
latitude values.
•A GCS treats the earth as a three-dimensional spherical (ellipsoidal) surface. It is defined
by an angular unit of measure (such as degrees), a prime meridian, and a datum.
•The position of any point is defined by the intersection of two imaginary lines: a line of
constant latitude (a parallel) and a line of constant longitude (a meridian).
•The line of latitude midway between the poles is called the Equator. It defines the line of
zero latitude.
Geographic Coordinate Systems(GCS)
• Reference system based on a 3D spherical surface

• A three-dimensional system adds the height or elevation to create an X, Y, Z position.

• Angular distances of latitude/longitude can be written in two ways:
• Degrees/minutes/seconds: 117º11’44.326”W, 34º3’23.029”N
• Decimal degrees: -117.195646, 34.056397
• A minute is 1/60 of a degree, and a second is 1/60 of a minute or 1/3600 of a degree.

For example:
• A latitude of 41 degrees, 27 minutes (′) and 41 seconds (″) north is written 41° 27′ 41″ N.
• To convert geographic coordinates into decimal degrees, use the following
calculation: Decimal degrees = Degrees + Minutes/60 + Seconds/3600. Coordinates west of the prime meridian or south of the equator are negative.
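The conversion formula above can be written as a short function. This is a minimal sketch; the function name and the hemisphere-letter argument are illustrative choices, and the negative sign for west and south follows the decimal-degree example given on this slide.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west are negative.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# The slide's own coordinate pair, rounded to six decimal places.
print(round(dms_to_decimal(117, 11, 44.326, "W"), 6))
print(round(dms_to_decimal(34, 3, 23.029, "N"), 6))
```

Rounding to six decimal places keeps roughly 0.1 m of precision on the ground, which is why the decimal-degree values above are quoted to that length.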
Projected or Cartesian Coordinate System
•Reference systems, called rectangular coordinates or plane coordinates, allow us to locate objects
correctly on flat maps (Two-dimensional maps projected from reference globe).
•It is based on a sphere or spheroid geographic coordinate system, but it uses linear units of measure
for coordinates, so that calculations of distance and area are easily done in terms of the same units.
•Has constant lengths, angles, and areas across the two dimensions
•The basic rectangular coordinate system consists of two lines:
•Abscissa (X-coordinates):is a horizontal line that contains equally spaced numbers starting from 0,
called the origin, and extending as far as we wish to measure distance in either of two directions.
•Ordinate (Y-coordinates): The ordinate allows us to move vertically from the same point of origin in
a positive or negative Y direction.
• Units generally in meters or feet

• A two-dimensional system, X (Easting) and Y (Northing), to describe a horizontal position on the Earth
Geographic Coordinate System
• Deals with earth in 3D
• Large (area) coverage
• Easy to identify location in a globe
• Less distorted but more difficult to work with
• Although longitude and latitude can locate exact positions on the surface of the globe, they are not uniform units of measure
Projected Coordinate System
• Deals with earth in 2D
• Small (area) coverage
• Easier to calculate spatial locations and relationships
• Easier to work with, but quantities like distances and angles are often distorted due to map projection
• Projected coordinate system has constant lengths, angles, and areas across the two dimensions
WHAT TYPE OF MAP PROJECTION SHOULD YOU CHOOSE?
❑An appropriate projection is distinguished by its suitability for representing a
particular portion and amount of the earth's surface and by its ability to preserve distance,
area, shape, or direction.
❑A few things to consider when choosing a projection:
➢ Which spatial properties do you want to preserve?
➢ Where is the area you're mapping?
➢ What shape is the area you're mapping?
➢ How big is the area you're mapping?
Types of map projections
❑Map projections can be grouped by the spatial property they preserve.
•Equal Area projections (area) - earth surface areas are shown correctly (for example, the Albers Equal Area Conic
projection); important for mass balances.
•Conformal projections (shape) - local angles are shown correctly (for example, the Lambert Conformal Conic
and Mercator projections). Drawback: areas enclosed by a series of arcs may be distorted.
•Azimuthal projections (direction) - all directions are shown correctly relative to the center
(for example, Lambert Azimuthal Equal Area).
•Equidistant projections (distance) - distance is preserved along particular lines.
•Some projections preserve two properties (for example, Lambert Azimuthal Equal Area preserves both area and direction from the center).
Distortions Caused by Map Projections

➢All projection types have distortions. The four spatial properties that are subject to distortion are:

▪ Shape

▪ Area

▪ Distance

▪ Direction
•One map projection might be used for large-scale data
in a limited area, while another is used for a small-
scale map of the world
Map Projections
• Map projection: the transformation of a curved earth to a flat map (3D to 2D).
• Parallels and meridians are used as a base on which to draw a map on a flat surface.
• A projection transforms a position on the Earth’s surface identified by latitude and longitude into
a position in Cartesian coordinates (x, y).
• All map projections are attempts to portray the surface of the earth on a flat surface.
• Map projections can be classified into three general families: Cylindrical, Conical, and
Azimuthal or Planar.

[Figure: conical, cylindrical, and planar projections]


Planar or polar projection
• Surface of globe is projected onto a plane tangent at only one
point
• Used frequently at N or S pole
• Usually only one hemisphere shown (centered on N or S pole)
• For example: Lambert Azimuthal Equal Area
Conic projection
• Analogous to wrapping a sheet of paper around the earth in a cone
• Normally shows just one semi hemisphere in middle latitudes.
• Very popular for maps of East-West oriented land masses
• Example: Lambert Conformal Conic
Cylindrical projection
• Low distortion at equator, higher distortion approaching poles.
• A good choice for use in equatorial and tropical regions such as Ecuador, Kenya, Ethiopia,
Malaysia.
• Example: Mercator projection
UTM (Universal Transverse Mercator)
• Perhaps the most prevalent plane grid system used in GIS operations is
UTM.
• UTM allows precise measurement using the metric system
and has been adopted for remote sensing work, topographic map preparation,
and natural resource database development.
• Commonly used for large-scale digital cartographic data.
• A distinguishing characteristic of UTM is that the x coordinates (eastings) have 6
digits, while the y coordinates (northings) usually have 7 digits.
• UTM divides the world into 60 zones, each 6 degrees of longitude wide, with central
meridians at 6-degree intervals.
• The zones start at 180 degrees west, and the zone numbers rise as you travel
east.
Universal Transverse Mercator (UTM)
• Easting is measured from a zone’s central meridian, which is assigned a false
easting of 500,000 meters. Northing is measured relative to the equator,
which has a value of 0 meters for coordinates in the northern hemisphere.
• Northing values in the southern hemisphere decrease southward from a false
northing of 10,000,000 meters at the equator.
• You must choose either northern or southern hemisphere coordinates when
choosing a UTM zone
• The scale factor at the central meridian is 0.9996.
• The UTM coordinate system uses the Transverse Mercator map projection,
which minimizes shape distortions for small geographic features.
• E.g. most of Ethiopia is in zone 37 and most of Britain is in zone 30.
• Works well for large-scale data sets and satellite image rectification, though
some areas cross zones.
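The zone layout described above (60 zones of 6 degrees, numbered eastward from 180 degrees west) can be turned into a small calculation. This is a sketch under simplifying assumptions: it ignores the special zone boundaries used around Norway and Svalbard, and the function names and the approximate city longitudes are illustrative.

```python
import math


def utm_zone(lon_deg):
    """Return the UTM zone number (1-60) for a longitude in degrees."""
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180")
    # Longitude 180 E would compute as zone 61, so clamp it to zone 60.
    return min(math.floor((lon_deg + 180.0) / 6.0) + 1, 60)


def central_meridian(zone):
    """Longitude of the zone's central meridian, in degrees."""
    return zone * 6 - 183


def false_northing_m(lat_deg):
    """0 m in the northern hemisphere, 10,000,000 m in the southern."""
    return 0.0 if lat_deg >= 0 else 10_000_000.0


# Approximate longitudes for Addis Ababa (38.75 E) and London (0.13 W).
print(utm_zone(38.75), central_meridian(utm_zone(38.75)))
print(utm_zone(-0.13), central_meridian(utm_zone(-0.13)))
```

A point on the central meridian of any zone has an easting of exactly 500,000 m, which is the false easting mentioned above.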
Modern Approaches to Map Projections
❑Web Mercator
• Many major online street mapping services (Bing Maps,
OpenStreetMap, Google Maps, MapQuest, Yahoo
Maps, and others) use a variant of the Mercator
projection for their map images.
• The projection is well suited as an interactive world
map that can be zoomed into seamlessly to large‐scale
(local) maps, where there is relatively little distortion due
to the variant projection's near-conformality.
• Web Mercator is the mapping of WGS84 datum (i.e.,
ellipsoidal) latitude/longitude into Easting/Northing using
spherical Mercator equations .
• This projection was popularized by Google in Google
Maps
• The reference ellipsoid is always WGS84, and the
spherical radius R is equal to the semi major axis of the
WGS84 ellipsoid a. That's "Web Mercator."
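The spherical Mercator equations referred to above can be sketched as follows, with the sphere radius set to the WGS84 semi-major axis (6,378,137 m). The function name is illustrative, and the code covers only the forward direction (longitude/latitude to easting/northing).

```python
import math

WGS84_SEMI_MAJOR_AXIS_M = 6378137.0  # used as the sphere radius R


def web_mercator_forward(lon_deg, lat_deg):
    """Project WGS84 longitude/latitude (degrees) to Web Mercator metres."""
    if abs(lat_deg) >= 90.0:
        raise ValueError("latitude must be strictly between -90 and 90")
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = WGS84_SEMI_MAJOR_AXIS_M * lon
    y = WGS84_SEMI_MAJOR_AXIS_M * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# The map is square: the easting at 180 E equals the northing at about
# 85.05 N, which is where web maps usually cut off the poles.
print(web_mercator_forward(180.0, 0.0))
print(web_mercator_forward(0.0, 85.05112878))
```

Because the equations treat ellipsoidal latitudes as if they were spherical, the result is only nearly conformal, which matches the "near-conformality" noted above.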
Chapter Five
5. Data entry and preparation
Spatial Data classification
• Digital Data Vs Analogue Data

Digital
▪ Easy to update
▪ Easy and quick transfer (e.g. via internet)
▪ Storage space required is relatively small (digital devices)
▪ Easy to maintain
▪ Easy automated analysis
Analogue
▪ Whole map has to be remade to update it
▪ Slow transfer (e.g. via post)
▪ Large storage space required (e.g. traditional map libraries)
▪ Paper maps disintegrate over time
▪ Difficult and inaccurate to analyze (e.g. to measure areas and distances)
Spatial Data classification
❑Text and imagery data
▪ Text data includes
• Reports
• Documents and records
• Statistics and census
• Results of investigation and experiments
▪ Imagery data includes
• Maps (digital map and analogue map)
• Photos
• RS (remote sensing, satellite images and
aerial photographs)
Data sources
• Primary data sources are those collected in
digital format specifically for use in a GIS
project.

• Secondary sources are digital and analog
datasets that were originally captured for
another purpose and need to be converted
into a suitable digital format for use in a
GIS project.
Data Sources for GIS
A wide variety of data sources exist for both spatial and attribute data. The most
common general sources for spatial data are:
• Remotely sensed satellite images
• Available hard copy map data
• Aerial photographs
• Tabular data
• Survey data and records
• Digitized and scanned data
• Existing digital Databases
• GPS field sampling/observations
• Reports
Data collection
• The processes of data collection are also
variously referred to as data capture, data
automation, data conversion, data transfer, data
translation, and digitizing.
• Data collection is a time consuming, tedious,
and expensive process.
• Typically it accounts for 15–50% of the total cost
of a GIS project
• If staff costs are excluded from a GIS budget,
then in cash expenditure terms data collection can
be as much as 60–85% of costs.
Stages in data collection projects
• Planning includes establishing user
requirements, garnering resources, and
developing a project plan.
• Preparation involves obtaining data,
redrafting, editing scanned map images,
removing noise.
• Digitizing and transfer are the stages where
the majority of the effort will be expended.
• Editing and improvement covers many
techniques designed to validate data, as well
as correct errors and improve quality.
• Evaluation is the process of identifying
project successes and failures.
Data Entry and Presentation
➢The first step of using GIS is to provide it with data
➢The acquisition and preprocessing of spatial data is
an expensive and time consuming process.
➢Data entry is the procedure of encoding data into a
computer-readable format and writing it to GIS
software. Methods include keyboard entry, digitizers,
scanners, entry of coordinates using coordinate
geometry, and conversion of existing digital data.
➢Data entry process is error prone.
➢Therefore, data entry phase is critical and must be
taken seriously
GIS data entry
❑Data entry is the procedure of encoding data into a computer-readable format and writing it to
GIS software, using a keyboard, digitizers, and scanners.
• Graphical data conversion :creates digital map layers
• Attribute data conversion: produces tabular data files associated with graphical
elements on a layer
Digital data conversion process
• Acquisition: digitizing; collecting primary
data in the field; purchasing from
government agencies or commercial data
suppliers; GPS-based attribute data; and
remote sensing and photogrammetric data
• Editing: clean the acquired data
• Formatting or translating: convert the
database format
• Linking graphical data to their
associated attribute data
Data entry by digitization
Indirect Data Capturing Method: Digitizing
•Digitizer: is an essential tool for converting printed map data into digital format.
•Components of a Digitizer: It consists of three main parts:
• Table – The surface on which the map is placed.
• Cursor – A pointer used to trace map features.
• Controller – The system that processes and records data.
On-tablet digitizing
• Uses a digitizing tablet and stylus
• Requires a physical map
• Less commonly used
On-screen digitizing
• Ordinary mouse
• Computer screen
• Points selection
• Can be done in two modes: point mode, stream mode
Map digitizing procedures
• Preparation for Digitizing
✓Checking the source map for accuracy and completeness.
✓Identifying control points, known as ‘tic
points’, for registering the output digital
data to map coordinates.
Map digitizing procedures contd..
• Creating a Digitizing Template: Tic points, map neat lines and graphical
elements that are common to all layers.
• Map Digitizing: register the map by digitizing its tic points and entering
the ground coordinates of the corresponding points through the keyboard.
✓Coordinates for corresponding objects on the map and on the screen
become identical.
✓Digitizing is carried out in stream mode or point mode
❖Stream mode: the digitizer generates coordinates automatically at set
time or distance intervals
❖Point mode: the digitizer generates coordinates only when the user
presses the button
✓The data file in point mode is small compared to stream mode.
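The difference in file size between the two modes can be illustrated with a small simulation. This is a sketch under stated assumptions: the cursor trace is hypothetical, and stream mode is modelled as recording one coordinate per fixed distance interval (real digitizers may use a time interval instead).

```python
import math

def stream_mode(path, interval):
    """Record a coordinate every `interval` map units along the traced path."""
    recorded = [path[0]]
    carried = 0.0  # distance travelled since the last recorded point
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        pos = interval - carried  # distance along this segment to the next record
        while pos <= seg:
            t = pos / seg
            recorded.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            pos += interval
        carried = seg - (pos - interval)
    return recorded

def point_mode(path):
    """Record only the vertices where the operator presses the button."""
    return list(path)

# Hypothetical cursor trace along two straight map edges (map units).
trace = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
stream_points = stream_mode(trace, interval=1.0)
point_points = point_mode(trace)
print(len(stream_points), "points in stream mode,", len(point_points), "in point mode")
```

For straight features the extra stream-mode points add nothing, which is why point mode suits straight edges while stream mode suits irregular curves.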
Map digitizing procedures contd..
▪ Post-digitizing data processing (graphical data
editing): ensures the integrity of the data before
they are used in a geographic database
✓ Integrity is maintained by making the data
free from errors
▪ Checks to make after digitization:
✓ Lines intersect where they are expected to
intersect (i.e., no undershoots or overshoots)
✓ Nodes are created at all points where lines
intersect
✓ All polygons are closed
✓ Each polygon contains a label point
✓ The topology of the layer is built
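Two of these checks, polygon closure and the presence of a label point, can be sketched in a few lines. The function names are hypothetical, and the point-in-polygon test is a standard ray-casting routine rather than the method of any specific GIS package.

```python
def is_closed(ring):
    """A ring is closed when it has at least 4 vertices and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]

def point_in_ring(point, ring):
    """Ray-casting test: True if the point lies inside the closed ring."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        if (y0 > y) != (y1 > y):  # edge crosses the horizontal line through the point
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

def check_polygon(ring, label_point):
    """Return a list of problems found for one digitized polygon."""
    problems = []
    if not is_closed(ring):
        problems.append("polygon is not closed")
    elif label_point is None or not point_in_ring(label_point, ring):
        problems.append("no label point inside polygon")
    return problems

square = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
open_ring = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(check_polygon(square, (1, 1)))
print(check_polygon(open_ring, (1, 1)))
print(check_polygon(square, (5, 5)))
```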
Data cleaning after digitizing
Post scanning Data Processing
❑Scanning is a non-selective data conversion process in the sense that every point on
the map is captured
• Post-scanning processing uses computer-assisted or manual methods
• The processing required depends on a variety of factors, including the quality of
the original maps, the complexity of the map contents, and the functionality of
the vectorization software
❑Post scanning Data Processing
✓Raster-Vector Conversion
✓Raster Text Conversion
✓Raster Symbol Conversion
✓Graphical Data Editing
✓Attribute Data Tagging
PRIMARY GEOGRAPHIC DATA CAPTURE
1. Raster data capture
• Remote sensing
• Aerial Photographs
2. Vector data capture
• Surveying
• GPS field mapping
• LiDAR
RASTER DATA CAPTURE
1. Remote Sensing
• Remote sensing is the process of
collecting spatial data from a distance
using sensors on satellites or aircraft.
• The data is stored in raster format,
where each pixel represents a value such
as temperature, vegetation, or land cover.
Examples: Landsat, MODIS, LiDAR
scanning
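The raster idea, one value per pixel tied to a ground location, can be sketched as follows. The grid values, class codes, and origin coordinates are hypothetical; the 30 m cell size matches Landsat's multispectral bands.

```python
# Hypothetical 3 x 4 land-cover raster; each cell holds a class code.
grid = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [3, 3, 3, 2],
]
origin_x, origin_y = 500000.0, 4200000.0  # upper-left corner (assumed values)
cell_size = 30.0  # metres

def cell_center(row, col):
    """Ground coordinates of the centre of cell (row, col)."""
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y - (row + 0.5) * cell_size  # rows run downward from the top edge
    return x, y

def value_at(x, y):
    """Class code of the cell containing ground point (x, y), or None if outside."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None

print(cell_center(0, 0))
print(value_at(*cell_center(1, 2)))
```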
2. Aerial Photographs
• Captured from aircraft or drones to produce high-resolution
images.
• More detailed than satellite imagery and often used
for large-scale (detailed) mapping of small areas.
• Aerial photographs can be:
• Vertical Photographs – Taken directly
overhead for topographic mapping.
• Oblique Photographs – Taken at an angle for
3D visualization.
VECTOR DATA CAPTURE
Surveying
• Ground surveying uses theodolites and
other ground-based instruments
• Ground survey is a very time-consuming and
expensive activity, but it is still the best way to
obtain highly accurate point locations.
• Typically used for capturing buildings, land
and property boundaries.
• Also employed to obtain reference marks for
use in other data capture projects
GPS Field Mapping
• Real-time data collection: GPS devices
capture point, line, and polygon features
with geographic coordinates.
• High accuracy mapping: Used for
navigation, land surveying, and spatial
data collection in GIS applications.
LiDAR
• Relatively new technology that
employs a scanning laser rangefinder
to produce accurate
topographic surveys
• Typically carried on a low-altitude
aircraft that also has an inertial
navigation system and a differential
GPS to provide location.
SECONDARY GEOGRAPHIC DATA CAPTURE
1. Raster data capture
• Raster data collection or capture
using scanner
2. Vector data capture
• Vectorization
Secondary geographic data capture
• Raster data collection/capture
using scanners
Three main reasons to scan hardcopy media
are:
• Documents are scanned to reduce wear and
tear, improve access, provide integrated
database storage, and to index them
geographically
• Maps, aerial photographs, and images are
scanned and georeferenced so that they provide
geographic context for other data
• Maps, aerial photographs, and images are
scanned prior to vectorization
Vector data collection/capture via
vectorization
• This involves digitizing vector objects from
maps and other geographic data sources.
Heads-up digitizing and vectorization
• Vectorization is the process of converting
raster data into vector data.
• The simplest way to create vectors from
raster layers is to digitize vector objects
manually straight off a computer screen
using a mouse or digitizing cursor.
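Automated vectorization is considerably more involved than manual tracing, but the basic raster-to-vector idea can be shown with a deliberately crude sketch: each horizontal run of cells with a chosen value becomes one rectangular polygon. The function name and grid are hypothetical, and coordinates are left in cell units rather than georeferenced.

```python
def vectorize_runs(grid, target, cell_size=1.0):
    """Turn each horizontal run of `target` cells into a closed rectangle polygon."""
    polygons = []
    for r, row in enumerate(grid):
        c = 0
        while c < len(row):
            if row[c] == target:
                start = c
                while c < len(row) and row[c] == target:
                    c += 1  # extend the run to the right
                x0, x1 = start * cell_size, c * cell_size
                y0, y1 = r * cell_size, (r + 1) * cell_size
                polygons.append([(x0, y0), (x1, y0), (x1, y1), (x0, y1), (x0, y0)])
            else:
                c += 1
    return polygons

# Hypothetical scanned raster: 1 marks cells belonging to the feature of interest.
scan = [
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]
rectangles = vectorize_runs(scan, target=1)
print(len(rectangles), "polygons")
```

A production tool would merge adjacent runs and smooth the outlines; this sketch stops at the first step.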
Spatial Data Preparation and Editing
1. Data Cleaning
• Ensures that spatial data is accurate,
consistent, and free of errors.
• Errors in GIS data can arise due to
digitization mistakes, GPS
inaccuracies, or human errors.
• Common data cleaning tasks:
✓Removing duplicate entries
✓Correcting misspelled attributes
✓Standardizing field values
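The three cleaning tasks above can be sketched on a small attribute table. The records, the `corrections` lookup, and the `clean` helper are hypothetical examples, not part of any GIS package.

```python
# Hypothetical attribute records with typical data entry problems.
records = [
    {"id": 1, "name": "Main Road", "surface": "Asphalt "},
    {"id": 1, "name": "Main Road", "surface": "Asphalt "},  # duplicate entry
    {"id": 2, "name": "Mian Road", "surface": "asphalt"},   # misspelled name
    {"id": 3, "name": "River Lane", "surface": "GRAVEL"},
]
corrections = {"Mian Road": "Main Road"}  # known misspellings (assumed lookup)

def clean(records, corrections):
    """Correct known misspellings, standardize values, then drop duplicates."""
    seen, cleaned = set(), []
    for rec in records:
        row = dict(rec)
        row["name"] = corrections.get(row["name"], row["name"])
        row["surface"] = row["surface"].strip().lower()
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned

cleaned = clean(records, corrections)
for row in cleaned:
    print(row)
```

Standardizing before removing duplicates matters: two rows that differ only in spelling or letter case are only recognized as duplicates once they have been normalized.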
2. Georeferencing
• The process of assigning geographic
coordinates to raster images (e.g.,
scanned maps, aerial photographs).
• Uses control points (known reference
locations) to align the image with real-
world coordinates.
• Example: A historical paper map of a
city can be scanned and georeferenced
by selecting points (e.g., road
intersections) that match real-world
GPS coordinates.
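Georeferencing with control points can be sketched as fitting an affine transformation from pixel positions to ground coordinates. This minimal example assumes exactly three control points and made-up coordinates; GIS software normally uses more points and a least-squares fit, so this is an illustration rather than a production method.

```python
def solve3(m, v):
    """Solve a 3x3 linear system m * p = v with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if d == 0:
        raise ValueError("control points are collinear")
    result = []
    for i in range(3):
        mi = [row[:] for row in m]
        for r in range(3):
            mi[r][i] = v[r]  # replace column i with the right-hand side
        result.append(det(mi) / d)
    return result

def fit_affine(pixels, world):
    """Fit x = a*col + b*row + c and y = d*col + e*row + f from 3 control points."""
    m = [[col, row, 1.0] for col, row in pixels]
    a, b, c = solve3(m, [x for x, _ in world])
    d, e, f = solve3(m, [y for _, y in world])
    return a, b, c, d, e, f

def to_world(params, col, row):
    a, b, c, d, e, f = params
    return a * col + b * row + c, d * col + e * row + f

# Hypothetical control points: pixel (col, row) and matching ground (x, y).
pixels = [(0, 0), (100, 0), (0, 100)]
world = [(1000.0, 2000.0), (1200.0, 2000.0), (1000.0, 1800.0)]
params = fit_affine(pixels, world)
print(to_world(params, 50, 50))
```

The control points must not lie on a single line, otherwise the transformation is undetermined; the sketch raises an error in that case.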
3. Topology Corrections
• Ensures that spatial relationships
between features are maintained.
• Helps prevent issues such as:
✓Gaps between polygons (e.g., in land
parcel maps).
✓Overlapping polygons (e.g., conflicting
land ownership boundaries).
✓Dangling nodes in line features (e.g.,
disconnected roads).
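Dangling nodes can be found by counting how many lines share each endpoint. The road network below is hypothetical, and `find_dangles` is an illustrative helper; note that it also flags legitimate dead ends, so its output is a list of candidates to inspect, not confirmed errors.

```python
from collections import Counter

def find_dangles(lines):
    """Return endpoints touched by only one line (candidate dangling nodes)."""
    ends = Counter()
    for line in lines:
        ends[line[0]] += 1
        ends[line[-1]] += 1
    return sorted(pt for pt, n in ends.items() if n == 1)

# Hypothetical road network; each road is a list of (x, y) vertices.
roads = [
    [(0, 0), (5, 0)],
    [(5, 0), (5, 5)],
    [(5, 0), (9, 0)],
    [(5, 5), (5.2, 8)],
]
dangles = find_dangles(roads)
print(dangles)
```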
ORGANIZING DATA FOR ANALYSIS

1. Data Structuring
•Organizing spatial and non-spatial data for
efficient analysis.
•Types of GIS data structures:
• Vector data: Points, lines, and polygons.
• Raster data: Grid-based data (e.g.,
satellite images).
• Attribute tables: Non-spatial information
linked to features.
Example: A municipal GIS system might store
roads as line features, buildings as polygons,
and streetlights as points, all linked to an
attribute database.
2. File Formats
•Different GIS formats serve different purposes.
•Common vector formats:
• Shapefile (.shp): Widely used, stores geometry and
attributes.
• KML (.kml): Used in Google Earth for 3D
visualization.
• GeoJSON (.geojson): Web-compatible format, used in
online mapping.
•Common raster formats:
• TIFF (.tif): High-resolution images, often used for
satellite data.
• JPEG (.jpg): Compressed image format, less precise
than TIFF.
Example: A government agency sharing GIS road data might
provide it in Shapefile (.shp) format for compatibility with
most GIS software.
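As an illustration of a web-compatible vector format, the sketch below writes one road as GeoJSON using only Python's standard `json` module. The road name and coordinates are hypothetical; GeoJSON stores each position as longitude first, then latitude.

```python
import json

def road_feature(name, coords):
    """Build a GeoJSON Feature for a road given (longitude, latitude) pairs."""
    return {
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": [list(c) for c in coords]},
        "properties": {"name": name},
    }

collection = {
    "type": "FeatureCollection",
    "features": [road_feature("Main Road", [(38.74, 9.03), (38.76, 9.04)])],
}

text = json.dumps(collection, indent=2)
print(text)
```

Geometry and attributes travel together in one text file, which is what makes the format convenient for online mapping.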
3. Layer Management
•GIS projects involve multiple layers representing
different datasets.
•Organizing layers ensures clarity and efficiency.
•Layers can be categorized as:
• Base layers: Background data (e.g., satellite
imagery, topographic maps).
• Thematic layers: Analytical data (e.g., land
use, population density).
• Reference layers: Additional context (e.g.,
administrative boundaries).
Example: A flood risk analysis might use:
•Elevation data (raster layer).
•River network (vector layer).
•Urban areas (vector layer).
4. Metadata Documentation
•Provides information about the source, accuracy, and projection of GIS data.
•Helps users understand the reliability of the data.
•Key metadata components:
• Data source: Who created the data?
• Projection & Coordinate system: What spatial reference is used?
• Date of creation: When was the data created or last updated?
Example: A dataset of land cover types might include metadata stating it was derived from
Landsat 8 imagery, using UTM projection, with an accuracy of 90%.
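A metadata record can be checked for completeness before a dataset is shared. The field names and the `missing_fields` helper below are hypothetical and do not follow any formal metadata standard; the values come from the land cover example above.

```python
# Fields this sketch treats as required (an assumption, not a formal standard).
REQUIRED = ("source", "coordinate_system", "date", "accuracy")

def missing_fields(metadata):
    """Return the required metadata fields that are absent or empty."""
    return [key for key in REQUIRED if not metadata.get(key)]

land_cover_meta = {
    "source": "Landsat 8 imagery",
    "coordinate_system": "UTM",
    "accuracy": "90%",
}
missing = missing_fields(land_cover_meta)
print("missing:", missing)
```

Here the check reports that the date is missing, which is exactly the kind of gap that makes a dataset's reliability hard to judge.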
