Chapter Three: 3. Data Management and Processing Systems
Advantages of a DBMS
• A DBMS allows organizations to define and enforce security and standards for data and data access.
• DBMS are better suited to managing large numbers of users simultaneously working with vast amounts of data.
• A DBMS can be instructed to guard over some levels of data correctness.
Disadvantages of a DBMS
• The cost of acquiring DBMS software can be quite high.
• A DBMS adds complexity to the problem of managing data, especially in small projects.
• Regular maintenance and administration are required.
Spatial Database/Geodatabase
• It is a relational database that contains spatial and non-spatial objects.
| Geodatabase type | Personal | File | ArcSDE Desktop | ArcSDE Workgroup | ArcSDE Enterprise |
| Functionality | Original desktop format | Improved desktop format | Distributed data or project-level use | Departmental projects or small organizations | Large capacity and user base |
| Storage limit | 2 GB per geodatabase; effective limit ~500 MB | 1 TB per object; configurable to 256 TB | 10 GB per database server | 10 GB per database server | Limited by relational database and hardware |
| User limit | One editor per database | One editor per object | Three concurrent users, one of which can edit | Ten concurrent users, all can edit | Unlimited |
| Platform | Windows | Any | Windows | Windows | Any |
| Licensing | ArcGIS for Desktop (Any) | ArcGIS for Desktop (Any) | ArcGIS for Desktop (Standard, Advanced) | ArcGIS for Server (Workgroup) | ArcGIS for Server (Enterprise) |
➢ Personal Geodatabases (single-user geodatabases):
✓ Used in ArcGIS since their initial release.
✓ Use the Microsoft Access data file structure (the .mdb file) and Jet Engine.
✓ Can be viewed by multiple users but edited by only one user at a time.
✓ Have a maximum size of 2 Gigabytes (GB) or less.
✓ Do not store raster data.
✓ Only supported on the Microsoft Windows operating system.
➢ Multi-user Geodatabases (ArcSDE geodatabases):
✓ Stored in a relational database using Oracle, Microsoft SQL Server, IBM DB2, IBM Informix, or PostgreSQL.
✓ Require ArcSDE and a DBMS (Database Management System).
✓ Can be read and edited by multiple users at the same time.
✓ Can store raster data.
✓ Better storage capacity than a single-user geodatabase.
• File geodatabase
• It is a relational database storage format. It’s a far more complex data structure than the
shapefile and consists of a .gdb folder housing dozens of files.
• It stores multiple feature classes and enables topological definitions (i.e. it allows the
user to define rules that govern the way different feature classes relate to one another).
• It provides a portable geodatabase that works across operating systems.
• It scales up to handle very large datasets.
• Datasets can scale beyond 500 GB per file with very fast performance.
• It allows users to compress vector data to a read-only format to reduce storage
requirements even further.
TERMINOLOGY IN SPATIAL DATABASES
1. Geographic Objects:
➢A geographic object corresponds to an entity of
the real world and has two components.
✓ Description. The object is described by a set of
descriptive attributes.
3.Maps
➢When a theme is displayed on paper or on-screen,
what the user sees is a map as it is commonly
displayed, with colours, a particular scale, a legend,
and so on.
Entity: Road in the real world.
Object: The road as a line feature in a GIS database, with attributes like name and road type.
Symbol: A black or red line on a map that represents the road, possibly with labels.
Element/Components of Spatial Database
1. Entity
An entity is "a phenomenon of interest in reality that is not further subdivided into
phenomena of the same kind"
•e.g. a city could be considered an entity and subdivided into component parts but these
parts would not be called cities, they would be districts, neighborhoods or the like
2. Spatial Object:
An object is "a digital representation of all or part of an entity" .
•The method of digital representation of a phenomenon varies according to scale, purpose
and other factors.
•e.g. a city could be represented geographically as a point if the area under consideration
were continental in scale. The same city could be represented as an area if
we are dealing with a geographic database for a state or a county.
TYPES OF DBMS
DBMS can be classified according to the way they store and manipulate
data.
Three main types of DBMS are available to GIS users today:
1. Relational (RDBMS)
2. Object (ODBMS)
3. Object-relational (ORDBMS)
1. A relational database comprises a set of tables, each a two-
dimensional list (or array) of records containing attributes about the
objects under study.
2. Object database management systems (ODBMS) were initially
designed to address several of the weaknesses of RDBMS. These
include the inability to store complete objects directly in the
database.
3. Object-relational DBMS (ORDBMS): Pure ODBMS have not been as widely
adopted as once expected. This is largely because of the massive installed
base of RDBMS and the fact that RDBMS vendors have now added many of
the important ODBMS capabilities to their standard RDBMS software
systems to create hybrid object-relational DBMS (ORDBMS).
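To make the relational model concrete, the following is a minimal sketch using Python's built-in sqlite3 module. The table name `roads`, its columns, and the sample rows are hypothetical and chosen only for illustration; a spatial DBMS would additionally store a geometry for each record.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A relational table: each row is a record, each column an attribute.
conn.execute(
    "CREATE TABLE roads ("
    "road_id INTEGER PRIMARY KEY, name TEXT, road_type TEXT, length_km REAL)"
)
conn.executemany(
    "INSERT INTO roads (road_id, name, road_type, length_km) VALUES (?, ?, ?, ?)",
    [
        (1, "Main Street", "arterial", 4.2),
        (2, "River Road", "collector", 7.9),
        (3, "Mill Lane", "local", 1.1),
    ],
)

# Select records by an attribute value.
long_roads = conn.execute(
    "SELECT name FROM roads WHERE length_km > ? ORDER BY name", (2.0,)
).fetchall()
print(long_roads)
```

Each tuple returned by the query is one record of the two-dimensional table described in point 1.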
Spatial Database Design
➢Database design involves
identifying important phenomena and
choosing the appropriate data
representation for storage.
➢The overall goal of
database design is to maintain data
consistency and integrity, reduce data
redundancy, increase system
performance, maintain maximum user
flexibility, and create a usable system.
However, database design is highly influenced by:
✓ Applications,
✓Data format and size,
✓Data maintenance and update,
✓Hardware/software,
✓Number and sophistication of users,
✓Schedule and budget of the project, management approach
Steps in Spatial database design
➢Database design involves three major
steps: building the conceptual, logical, and physical models.
1. Conceptual Model
➢ Model the user’s view. This involves tasks such as identifying organizational functions.
➢ Define objects and their relationships. The object types (classes) and functions can be specified.
➢ Select geographic representation. Choosing the types of geographic representation (discrete object: point, line, or polygon; or field).
2. Logical Model
➢Match to geographic database types. This involves matching the object types to be studied to specific data types supported by the GIS that will be used to create and maintain the database.
➢Organize geographic database structure. This includes tasks such as defining topological associations, specifying rules and relationships, and assigning coordinate systems.
3. Physical Model
➢Define database schema. The final stage is the definition of the actual physical database schema that will hold the database data values.
➢A schema is a compact graphical representation of the conceptual model, the entities, and the relationships among them.
Figure: Stages in spatial database design. Conceptual Model (1. User View; 2. Objects and Relationships; 3. Geographic Representation), Logical Model (4. Geographic Database Types; 5. Geographic Database Structure), Physical Model (6. Database Schema).
Chapter Four
➢In reality, the earth resembles a figure called an ellipsoid
(or spheroid) more closely than it resembles a sphere.
• Equator
• Prime Meridian
• Parallels
• Meridians
PARALLELS(A) and MERIDIANS(B)
• Parallels run parallel to each other in an
east-west direction around the Earth.
X axis = Equator
Y axis = Prime Meridian
Why referencing and projection
❖Reference systems and projections are
needed for:
✓Creating spatial data (collecting GPS
data)
✓Importing data into a GIS and overlaying
it with other layers
✓Acquiring spatial data from other
sources
✓Displaying your GPS data on maps
✓Describing where features are located
in the real world
Geographic Coordinate Systems(GCS)
•Geographic coordinate systems consist of angles of latitude, which vary from north to south,
and longitude, which vary from east to west. A point is referenced by its longitude and
latitude values.
•A GCS models the earth as a three-dimensional spherical (ellipsoidal) surface. Any location
on the earth's surface is defined using an angular unit of measure (such as degrees), a
prime meridian, and a datum.
•The position of any point is defined by the intersection of two imaginary lines: a parallel and a meridian.
•Parallels are lines of constant latitude, and meridians are lines of constant longitude.
•The line of latitude midway between the poles is called the Equator. It defines the line of
zero latitude.
• Reference system based on a 3D spherical surface
For example:
• A latitude of 41 degrees (°), 27 minutes (′), and 41 seconds (″) north is written 41° 27′ 41″ N.
• To convert geographic coordinates into decimal degrees, use the following
calculation: Decimal degrees = Degrees + Minutes/60 + Seconds/3600
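The conversion formula can be sketched in a few lines of Python. The function name `dms_to_decimal` and the `hemisphere` parameter are illustrative choices, not part of any GIS library; treating south and west as negative values is a common convention and is assumed here.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees/minutes/seconds to decimal degrees.

    Assumption: southern and western hemispheres are returned as negatives.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere.upper() in ("S", "W") else value


# The latitude from the example: 41 degrees, 27 minutes, 41 seconds north.
lat = dms_to_decimal(41, 27, 41, "N")
print(round(lat, 4))  # 41.4614
```

Here 27/60 = 0.45 and 41/3600 ≈ 0.0114, so the result is approximately 41.4614 decimal degrees.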
Projected or Cartesian Coordinate System
•Reference systems, called rectangular coordinates or plane coordinates, allow us to locate objects
correctly on flat maps (Two-dimensional maps projected from reference globe).
•It is based on a sphere or spheroid geographic coordinate system, but it uses linear units of measure
for coordinates, so that calculations of distance and area are easily done in terms of the same units.
•Has constant lengths, angles, and areas across the two dimensions
•The basic rectangular coordinate system consists of two lines:
•Abscissa (X-coordinates): A horizontal line that contains equally spaced numbers starting from 0,
called the origin, and extending as far as we wish to measure distance in either of two directions.
•Ordinate (Y-coordinates): The ordinate allows us to move vertically from the same point of origin in
a positive or negative Y direction.
• Units generally in meters or feet
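Because projected coordinates use linear units, distance and area reduce to plane geometry. The sketch below assumes coordinates in meters; the function names and the sample coordinates are made up for illustration.

```python
import math


def planar_distance(x1, y1, x2, y2):
    """Straight-line distance between two projected points (same units as input)."""
    return math.hypot(x2 - x1, y2 - y1)


def polygon_area(vertices):
    """Area of a simple polygon from its (x, y) vertices, using the shoelace formula."""
    total = 0.0
    for i, (x_a, y_a) in enumerate(vertices):
        x_b, y_b = vertices[(i + 1) % len(vertices)]  # wrap around to the first vertex
        total += x_a * y_b - x_b * y_a
    return abs(total) / 2


distance_m = planar_distance(0, 0, 300, 400)
area_m2 = polygon_area([(0, 0), (100, 0), (100, 50), (0, 50)])
print(distance_m, area_m2)  # 500.0 5000.0
```

The same calculations on latitude and longitude values would not give meaningful results, since degrees are angular units.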
▪ Shape
▪ Area
▪ Distance
▪ Direction
•One map projection might be used for large-scale data
in a limited area, while another is used for a small-
scale map of the world
Map Projections
• Map Projection - the transformation of a curved earth to a flat map (3D to 2D)
• Parallels and meridians used as a base on which to draw a map on a flat surface
• A map projection transforms a position on the Earth’s surface, identified by latitude and longitude, into
a position in Cartesian coordinates (x, y).
• All map projections are attempts to portray the surface of the earth on a flat surface.
• Map projections can be classified into three general families: Cylindrical, Conical, and
Azimuthal or Planar.
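As one example from the cylindrical family, the forward equations of the spherical Mercator projection are x = R·λ and y = R·ln(tan(π/4 + φ/2)), with longitude λ and latitude φ in radians. The sketch below is a simplification that treats the earth as a sphere; the radius value (the WGS84 semi-major axis) and the function name are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6378137.0  # assumed sphere radius (WGS84 semi-major axis)


def mercator_forward(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude in degrees to spherical Mercator x, y in meters."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


x0, y0 = mercator_forward(0.0, 0.0)    # equator and prime meridian map to the origin
x1, y1 = mercator_forward(90.0, 0.0)   # a quarter of the way around the equator
x2, y2 = mercator_forward(0.0, 45.0)   # northern latitudes give positive y
print(x1, y2)
```

Mercator is not defined at the poles (latitude ±90°), which is one reason different projections are chosen for different areas and scales.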
| Digital | Analogue |
| easy to update | whole map to be remade |
| easy and quick transfer (e.g. via internet) | slow transfer (e.g. via post) |
| storage space required is relatively small (digital devices) | large storage space required (e.g. traditional map libraries) |
| easy to maintain | paper maps disintegrate over time |
| easy automated analysis | difficult and inaccurate to analyze (e.g. to measure areas and distances) |
Spatial Data classification
❑Text and Imagery data
▪ Text data includes
• Reports
• Documents and records
• Statistics and census
• Results of investigation and experiments
▪ Imagery data includes
• Maps (digital map and analogue map)
• Photos
• RS (remote sensing, satellite images and
aerial photographs)
Data sources
• Primary data sources are those collected in
digital format specifically for use in a GIS
project.
GEOGRAPHIC DATA CAPTURE
2. Vector data capture
• Vectorization
Secondary geographic data capture
• Raster data collection/capture
using scanners
Three main reasons to scan hardcopy media
are:
• Documents are scanned to reduce wear and
tear, improve access, provide integrated
database storage, and index them
geographically.
• Maps, aerial photographs, and images
are scanned and georeferenced so that they
provide geographic context for other data.
• Maps, aerial photographs, and images
are scanned prior to vectorization.
Vector data collection/capture via
vectorization
• This involves digitizing vector objects from
maps and other geographic data sources.
Heads-up digitizing and vectorization
• Vectorization is the process of converting
raster data into vector data.
• The simplest way to create vectors from
raster layers is to digitize vector objects
manually straight off a computer screen
using a mouse or digitizing cursor.
Spatial Data Preparation and Editing
1. Data Cleaning
• Ensures that spatial data is accurate,
consistent, and free of errors.
• Errors in GIS data can arise due to
digitization mistakes, GPS
inaccuracies, or human errors.
• Common data cleaning tasks:
✓Removing duplicate entries
✓Correcting misspelled attributes
✓Standardizing field values
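The three cleaning tasks can be sketched with plain Python dictionaries. The sample records, the alias table, and the spelling-correction table below are all hypothetical; in practice such lookup tables would come from the project's own data standards.

```python
# Hypothetical attribute records with a duplicate, a misspelling, and mixed field values.
records = [
    {"id": 1, "name": "Main Street", "type": "Rd"},
    {"id": 2, "name": "Main Streeet", "type": "road"},
    {"id": 1, "name": "Main Street", "type": "Rd"},
    {"id": 3, "name": "Mill Lane", "type": "ROAD "},
]

TYPE_ALIASES = {"rd": "road", "road": "road"}   # standardized field values
NAME_FIXES = {"Main Streeet": "Main Street"}    # known misspellings


def clean(rows):
    """Correct names, standardize types, and drop exact duplicates (first one wins)."""
    seen = set()
    cleaned_rows = []
    for rec in rows:
        name = NAME_FIXES.get(rec["name"], rec["name"])
        raw_type = rec["type"].strip().lower()
        row = {"id": rec["id"], "name": name, "type": TYPE_ALIASES.get(raw_type, raw_type)}
        key = (row["id"], row["name"], row["type"])
        if key in seen:
            continue  # duplicate entry
        seen.add(key)
        cleaned_rows.append(row)
    return cleaned_rows


cleaned = clean(records)
print(cleaned)
```

Four input records become three: the repeated record with id 1 is dropped, the misspelled name is corrected, and every `type` value becomes `road`.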
2. Georeferencing
• The process of assigning geographic
coordinates to raster images (e.g.,
scanned maps, aerial photographs).
• Uses control points (known reference
locations) to align the image with real-
world coordinates.
• Example: A historical paper map of a
city can be scanned and georeferenced
by selecting points (e.g., road
intersections) that match real-world
GPS coordinates.
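One simple way to align an image using control points is an affine transformation, X = a·col + b·row + c and Y = d·col + e·row + f, whose six parameters can be solved from three control points. The sketch below is illustrative only: the function names and the control-point coordinates are made up, and GIS software typically uses more control points with a least-squares fit.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for k in range(3):
        replaced = [r[:k] + [rhs[i]] + r[k + 1:] for i, r in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(control_points):
    """control_points: three ((col, row), (x, y)) pairs linking pixels to world coordinates."""
    matrix = [[float(col), float(row), 1.0] for (col, row), _ in control_points]
    a, b, c = solve3(matrix, [world[0] for _, world in control_points])
    d, e, f = solve3(matrix, [world[1] for _, world in control_points])
    return a, b, c, d, e, f


def pixel_to_world(params, col, row):
    a, b, c, d, e, f = params
    return a * col + b * row + c, d * col + e * row + f


# Hypothetical control points: 2 m pixels, with Y decreasing as the row number increases.
gcps = [((0, 0), (500000.0, 4650000.0)),
        ((1000, 0), (502000.0, 4650000.0)),
        ((0, 1000), (500000.0, 4648000.0))]
params = fit_affine(gcps)
center = pixel_to_world(params, 500, 500)
print(center)
```

Once the parameters are known, every pixel in the scanned map can be assigned a real-world coordinate.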
3. Topology Corrections
• Ensures that spatial relationships
between features are maintained.
• Helps prevent issues such as:
✓Gaps between polygons (e.g., in land
parcel maps).
✓Overlapping polygons (e.g., conflicting
land ownership boundaries).
✓Dangling nodes in line features (e.g.,
disconnected roads).
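As an illustration of one such check, dangling nodes can be found by counting how many line endpoints share each coordinate: an endpoint used only once is not connected to any other line. The sketch below is a simplification (it ignores endpoints that touch the middle of another line), and the road coordinates and the `allowed` set of legitimate dead ends are hypothetical.

```python
from collections import Counter


def find_dangling_nodes(lines, allowed=()):
    """Return endpoints that occur only once and are not known, legitimate dead ends."""
    counts = Counter()
    for line in lines:
        counts[line[0]] += 1    # start node
        counts[line[-1]] += 1   # end node
    return sorted(pt for pt, n in counts.items() if n == 1 and pt not in allowed)


# Hypothetical road network; the last road should start at (5.0, 5.0) but was
# digitized with a small gap, leaving two disconnected endpoints.
roads = [
    [(0.0, 0.0), (5.0, 0.0)],
    [(5.0, 0.0), (10.0, 0.0)],
    [(5.0, 0.0), (5.0, 5.0)],
    [(5.0, 5.2), (5.0, 10.0)],
]
map_edge_nodes = {(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)}  # roads that leave the study area

dangling = find_dangling_nodes(roads, allowed=map_edge_nodes)
print(dangling)  # [(5.0, 5.0), (5.0, 5.2)]
```

The two reported nodes mark the gap that an editor would close by snapping one endpoint to the other.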
ORGANIZING DATA FOR ANALYSIS
1. Data Structuring
•Organizing spatial and non-spatial data for
efficient analysis.
•Types of GIS data structures:
• Vector data: Points, lines, and polygons.
• Raster data: Grid-based data (e.g.,
satellite images).
• Attribute tables: Non-spatial information
linked to features.
Example: A municipal GIS system might store
roads as line features, buildings as polygons,
and streetlights as points, all linked to an
attribute database.
2. File Formats
•Different GIS formats serve different purposes.
•Common vector formats:
• Shapefile (.shp): Widely used, stores geometry and
attributes.
• KML (.kml): Used in Google Earth for 3D
visualization.
• GeoJSON (.geojson): Web-compatible format, used in
online mapping.
•Common raster formats:
• TIFF (.tif): High-resolution images, often used for
satellite data.
• JPEG (.jpg): Compressed image format, less precise
than TIFF.
Example: A government agency sharing GIS road data might
provide it in Shapefile (.shp) format for compatibility with
most GIS software.
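To show what a vector format looks like in practice, the sketch below builds a small GeoJSON FeatureCollection with Python's json module. Each feature pairs a geometry with its attributes under `properties`, and GeoJSON coordinates are ordered longitude first, then latitude. The coordinates and attribute values are invented for illustration.

```python
import json

# A point feature (e.g. a streetlight) with its attributes.
streetlight = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [38.7578, 9.0301]},  # [lon, lat]
    "properties": {"asset": "streetlight", "height_m": 8},
}

# A line feature (e.g. a road) with its attributes.
road = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[38.7570, 9.0300], [38.7590, 9.0305]],
    },
    "properties": {"name": "Example Road", "road_type": "local"},
}

collection = {"type": "FeatureCollection", "features": [streetlight, road]}

# Serialize to text (what a .geojson file contains) and read it back.
text = json.dumps(collection, indent=2)
parsed = json.loads(text)
geometry_types = [feature["geometry"]["type"] for feature in parsed["features"]]
print(geometry_types)  # ['Point', 'LineString']
```

Because the result is plain text, it can be shared over the web and opened by most mapping libraries without conversion.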
3. Layer Management
•GIS projects involve multiple layers representing
different datasets.
•Organizing layers ensures clarity and efficiency.
•Layers can be categorized as:
• Base layers: Background data (e.g., satellite
imagery, topographic maps).
• Thematic layers: Analytical data (e.g., land
use, population density).
• Reference layers: Additional context (e.g.,
administrative boundaries).
Example: A flood risk analysis might use:
•Elevation data (raster layer).
•River network (vector layer).
•Urban areas (vector layer).
4. Metadata Documentation
•Provides information about the source, accuracy, and projection of GIS data.
•Helps users understand the reliability of the data.
•Key metadata components:
• Data source: Who created the data?
• Projection & Coordinate system: What spatial reference is used?
• Date of creation: When was the data created or last updated?
Example: A dataset of land cover types might include metadata stating it was derived from
Landsat 8 imagery, using UTM projection, with an accuracy of 90%.
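A metadata record like the one in the example can be sketched as a simple dictionary with a check for the key components listed above. The field names are hypothetical and do not follow any particular metadata standard; the creation date is left empty because the example does not state one.

```python
REQUIRED_KEYS = ("data_source", "coordinate_system", "date_created")

# Metadata for the land cover example; only values stated in the example are filled in.
metadata = {
    "title": "Land cover types",
    "data_source": "Landsat 8 imagery",
    "coordinate_system": "UTM",
    "accuracy_percent": 90,
    "date_created": None,  # not given in the example
}


def missing_fields(record, required=REQUIRED_KEYS):
    """List required metadata fields that are absent or empty."""
    return [key for key in required if record.get(key) in (None, "")]


missing = missing_fields(metadata)
print(missing)  # ['date_created']
```

A check like this can flag incomplete metadata before a dataset is shared.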