Spatial Databases
ENVE/CE 424/524
Definitions
Database - an integrated set of data on a particular subject
Spatial database - a database containing geographic data on a particular subject
for a particular area
Database Management System (DBMS) - software to create, maintain and
access databases
[Figure: division of roles between GIS and DBMS]
Geographic Information System: data load, editing, visualization, mapping, analysis
Database Management System: data storage, indexing, security, query
GIS: old and new
GIS used to be monolithic systems
all-in-one, proprietary applications that stored, queried, and visualized data
New systems follow more of a tool-box approach
modularized applications that interoperate
Who can benefit from spatial data management?
Army Commander: Has there been any significant enemy troop movement in
the past week?
Insurance Risk Manager: Which houses are most likely to be affected in the
next great flood on the Mississippi?
Medical Doctor: Based on this patient's MRI, have we treated somebody with
a similar condition?
Molecular Biologist: Is the topology of the amino acid biosynthesis gene in
the genome found in any other sequence feature map in the database?
Astronomer: Find all blue galaxies within 2 arcmin of quasars.
Three classes of users for spatial databases
Major database managers: specialized products for enterprise management
GIS users: analysis of data
Internet users: more generalized requirements
Advantages of Databases over Files
Avoids redundancy and duplication
Reduces data maintenance costs
Applications are separated from the data
Applications persist over time
Support multiple concurrent applications
Better data sharing
Security and standards can be defined and enforced
Disadvantages of Databases Compared with Files
Expense
Complexity
Performance, especially with complex data types
Integration with other systems can be difficult
Types of DBMS Model
Hierarchical
Network
Relational - RDBMS
Object-oriented - OODBMS
Object-relational - ORDBMS
Characteristics of DBMS
Data model - support for multiple data types
e.g., MS Access: Text, Memo, Number, Date/Time, Currency, AutoNumber,
Yes/No, OLE Object, Hyperlink, Lookup Wizard
Load data from files, databases and other applications
Index for rapid retrieval
Query language - SQL
Security - controlled access to data
Multi-level groups
Controlled update using a transaction manager
Backup and recovery
Relational DBMS
Data stored as tuples (pronounced tup-el), conceptualized as tables
Table - data about a class of objects
Two-dimensional list (array)
Rows = objects
Columns = object states (properties, attributes)
[Figure: structure of a table]
Table = object class
Row = object
Column = property
Object classes with geometry are called feature classes
Relational DBMS
Most popular type of DBMS
Over 95% of data in DBMS is in RDBMS
Commercial systems
IBM DB2
Informix
Microsoft Access
Microsoft SQL Server
Oracle
Sybase
Spatial Database Example
Land parcel with boundary id: 1050
Relational Database Example
Four tables needed in the land parcel relational database
Relational database example #2
Relation Rules (Codd, 1970)
Only one value in each cell (intersection of row and column)
All values in a column are about the same subject
Each row is unique
No significance in column sequence
No significance in row sequence
SQL
Structured (Standard) Query Language (pronounced SEQUEL)
Developed by IBM in the 1970s
Now standard for accessing relational databases
Three types of usage
Stand alone queries
High level programming
Embedded in other applications (ArcGIS)
Types of SQL Statements
Data Definition Language (DDL)
Create, alter and delete database structures (tables, indexes)
CREATE TABLE, CREATE INDEX
Data Manipulation Language (DML)
Retrieve and manipulate data
SELECT, UPDATE, DELETE, INSERT
Data Control Language (DCL)
Control security of data
GRANT, CREATE USER, DROP USER
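The DDL and DML statement types above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table name, column names, owners and areas are hypothetical; only parcel id 1050 echoes the land parcel example. SQLite has no user accounts, so DCL statements such as GRANT are not shown.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the structure (hypothetical table and column names).
cur.execute(
    "CREATE TABLE parcel (parcel_id INTEGER PRIMARY KEY, owner TEXT, area_m2 REAL)"
)
cur.execute("CREATE INDEX idx_parcel_owner ON parcel (owner)")

# DML: INSERT, UPDATE, DELETE and SELECT operate on the rows.
cur.executemany(
    "INSERT INTO parcel VALUES (?, ?, ?)",
    [(1050, "Smith", 820.0), (1051, "Jones", 640.5), (1052, "Smith", 1200.0)],
)
cur.execute("UPDATE parcel SET owner = ? WHERE parcel_id = ?", ("Lee", 1051))
cur.execute("DELETE FROM parcel WHERE parcel_id = ?", (1052,))
rows = cur.execute(
    "SELECT parcel_id, owner, area_m2 FROM parcel ORDER BY parcel_id"
).fetchall()

# DCL (GRANT, CREATE USER, ...) is not available in SQLite and is omitted here.
conn.close()
```

In a server DBMS the same DDL and DML would normally be followed by DCL statements to control who may read or update the table.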
Spatial Types - OGC Simple Features
Data Model: A set of constructs for representing
objects and processes in a digital environment
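[Figure: OGC Simple Features geometry type hierarchy]
Geometry (references a SpatialReferenceSystem)
  Point
  Curve
    LineString
      Line
      LinearRing
  Surface
    Polygon (composed of LinearRings)
  GeometryCollection
    MultiPoint (composed of Points)
    MultiCurve
      MultiLineString (composed of LineStrings)
    MultiSurface
      MultiPolygon (composed of Polygons)
Legend: type relationship, composed relationship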
Spatial Relations
Equals - are the geometries the same?
Disjoint - do the geometries share no common point?
Intersects - do the geometries intersect?
Touches - do the geometries intersect only at their boundaries?
Crosses - do the geometries cross (geometries may differ in dimension, e.g., a line crossing a polygon)?
Within - is one geometry within another?
Contains - does one geometry completely contain another?
Overlaps - do the geometries partially overlap (geometries must have the same dimension)?
Relate - are there intersections between the interior, boundary or
exterior of the geometries?
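To make the relations above concrete, here is a minimal sketch restricted to axis-aligned rectangles with non-zero area. The `Rect` type and the function names are hypothetical and are not the OGC or any library API. Real spatial databases evaluate these predicates on arbitrary geometries.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle; assumes xmin < xmax and ymin < ymax."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Rect, b: Rect) -> bool:
    # At least one common point; boundaries count.
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def disjoint(a: Rect, b: Rect) -> bool:
    # No common point at all.
    return not intersects(a, b)


def interiors_intersect(a: Rect, b: Rect) -> bool:
    # Strict inequalities exclude contact along an edge or at a corner.
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def touches(a: Rect, b: Rect) -> bool:
    # Common points exist, but only on the boundaries.
    return intersects(a, b) and not interiors_intersect(a, b)


def contains(a: Rect, b: Rect) -> bool:
    # No part of b lies outside a.
    return (a.xmin <= b.xmin and a.ymin <= b.ymin
            and b.xmax <= a.xmax and b.ymax <= a.ymax)


def within(a: Rect, b: Rect) -> bool:
    return contains(b, a)


def overlaps(a: Rect, b: Rect) -> bool:
    # Interiors share area, but neither rectangle contains the other.
    return (interiors_intersect(a, b)
            and not contains(a, b) and not contains(b, a))


study = Rect(0, 0, 10, 10)
parcel = Rect(2, 2, 4, 4)        # inside the study area
neighbour = Rect(10, 0, 15, 5)   # shares only an edge with the study area
far = Rect(20, 20, 25, 25)       # no common point
partial = Rect(8, 8, 12, 12)     # partly inside, partly outside
```

With these inputs, `contains(study, parcel)`, `touches(study, neighbour)`, `disjoint(study, far)` and `overlaps(study, partial)` all hold.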
Contains Relation
Touches Relation
Spatial Methods
Distance - determines the shortest distance between any two points in two
geometries
Buffer - returns a geometry that represents all the points whose distance
from the geometry is less than or equal to a user-defined distance
ConvexHull - returns a geometry representing the smallest convex polygon that
encloses another geometry (no concave areas)
Intersection - returns a geometry that contains just the points common to
both input geometries
Union - returns a geometry that contains all the points in either of the input
geometries
Difference - returns a geometry containing the points of the first geometry
that are not in the second geometry
SymDifference - returns a geometry containing the points that are in
either of the input geometries, but not both
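The four set-style methods (Intersection, Union, Difference, SymDifference) behave like ordinary set operations on points. As a rough illustration, the sketch below approximates two square geometries by sets of unit grid cells and applies Python's set operators. The `cells` helper and the raster approximation are assumptions made for the example, not how a spatial DBMS stores geometry.

```python
def cells(xmin, ymin, xmax, ymax):
    """Hypothetical helper: the unit grid cells covered by a rectangle."""
    return {(x, y) for x in range(xmin, xmax) for y in range(ymin, ymax)}


a = cells(0, 0, 4, 4)  # 4 x 4 square, 16 cells
b = cells(2, 2, 6, 6)  # 4 x 4 square shifted by (2, 2), 16 cells

intersection = a & b      # cells common to both
union = a | b             # cells in either
difference = a - b        # cells of a that are not in b
sym_difference = a ^ b    # cells in exactly one of the two
```

Note that Difference depends on argument order (`a - b` is not `b - a`), while SymDifference does not.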
Convex Hull and Difference Methods
Convex Hull
Difference
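A convex hull of a point set can be computed in several ways. The sketch below uses Andrew's monotone chain algorithm, chosen here for brevity. The slides do not say which algorithm any particular DBMS uses, and the function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would make a right turn or a straight line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (2, 0)]
hull = convex_hull(points)
```

Interior points such as (2, 2) and collinear edge points such as (2, 0) are dropped, leaving only the four corners.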
Indexing
Used to locate rows quickly
Like a book index, it is a special representation of the content that
adds order and makes finding items faster
RDBMS use simple 1-d indexing
Spatial DBMS needs 2-d, hierarchical indexing
Grid
Quadtree
R-tree
Multi-level queries are often used for performance (coarse MBR test first, then exact geometry test)
Grid Index (multi-level)
- Overlay uniform grid
- Assign objects a grid id
Multi-level grids are used for variable-sized objects within a database
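The two steps above (overlay a uniform grid, assign objects a grid id) can be sketched as a single-level grid index. The class name, method names and cell size are hypothetical. A multi-level grid would keep several such indexes with different cell sizes.

```python
from collections import defaultdict


class GridIndex:
    """Single-level uniform grid over bounding boxes (xmin, ymin, xmax, ymax)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (col, row) -> ids of objects touching it

    def _cell_range(self, xmin, ymin, xmax, ymax):
        c = self.cell_size
        for i in range(int(xmin // c), int(xmax // c) + 1):
            for j in range(int(ymin // c), int(ymax // c) + 1):
                yield (i, j)

    def insert(self, obj_id, bbox):
        # An object is registered in every cell its bounding box touches.
        for cell in self._cell_range(*bbox):
            self.cells[cell].add(obj_id)

    def candidates(self, bbox):
        # Coarse filter only: an exact geometry test must still follow.
        found = set()
        for cell in self._cell_range(*bbox):
            found |= self.cells.get(cell, set())
        return found


index = GridIndex(cell_size=10)
index.insert("a", (1, 1, 2, 2))
index.insert("b", (55, 55, 58, 58))
index.insert("c", (8, 8, 12, 12))   # straddles four cells
near_origin = index.candidates((0, 0, 5, 5))
```

Object "c" is returned as a candidate for the query window even though its box does not intersect it, which is why the exact test is still needed.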
Point and Region Quadtree Indexing
Based on recursive division of space.
Point Quadtree
Region Quadtree
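Recursive division of space can be illustrated with a small quadtree that stores points and splits a cell into four equal quadrants when it fills up. This variant splits at cell midpoints (as a region quadtree does) rather than at the data points (as a point quadtree does). The class, its capacity and its depth limit are assumptions made for the sketch.

```python
class QuadTree:
    """Quadtree over the half-open square [xmin, xmax) x [ymin, ymax)."""

    def __init__(self, xmin, ymin, xmax, ymax, capacity=2, depth=0, max_depth=8):
        self.bounds = (xmin, ymin, xmax, ymax)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth  # guards against endless splitting on duplicates
        self.points = []
        self.children = None

    def _covers(self, p):
        xmin, ymin, xmax, ymax = self.bounds
        return xmin <= p[0] < xmax and ymin <= p[1] < ymax

    def insert(self, p):
        if not self._covers(p):
            return False
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append(p)
                return True
            self._split()
        return any(child.insert(p) for child in self.children)

    def _split(self):
        xmin, ymin, xmax, ymax = self.bounds
        xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(xmin, ymin, xmid, ymid, *args),  # south-west
            QuadTree(xmid, ymin, xmax, ymid, *args),  # south-east
            QuadTree(xmin, ymid, xmid, ymax, *args),  # north-west
            QuadTree(xmid, ymid, xmax, ymax, *args),  # north-east
        ]
        for q in self.points:  # push stored points down into the quadrants
            any(child.insert(q) for child in self.children)
        self.points = []

    def query(self, qxmin, qymin, qxmax, qymax):
        """Return the points inside the closed query window."""
        xmin, ymin, xmax, ymax = self.bounds
        if qxmax < xmin or qxmin >= xmax or qymax < ymin or qymin >= ymax:
            return []  # window misses this cell: prune the whole subtree
        if self.children is None:
            return [p for p in self.points
                    if qxmin <= p[0] <= qxmax and qymin <= p[1] <= qymax]
        found = []
        for child in self.children:
            found.extend(child.query(qxmin, qymin, qxmax, qymax))
        return found


tree = QuadTree(0, 0, 100, 100)
inserted = [tree.insert(p) for p in [(10, 10), (20, 20), (80, 80), (60, 40), (15, 90)]]
south_west = sorted(tree.query(0, 0, 50, 50))
```

The pruning step in `query` is what makes the index useful: whole quadrants are skipped without looking at their points.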
R-tree
Use minimum bounding rectangle (MBR) or minimum bounding box (MBB)
Add a new object to the node whose MBR would need to expand the least to accommodate it
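The least-expansion rule can be sketched as follows. The function names are hypothetical, rectangles are (xmin, ymin, xmax, ymax) tuples, and ties are broken by the smaller existing area, which is one common convention. A full R-tree also needs node splitting and a tree structure, which are left out here.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def expand(r, obj):
    """Smallest rectangle enclosing both r and obj."""
    return (min(r[0], obj[0]), min(r[1], obj[1]),
            max(r[2], obj[2]), max(r[3], obj[3]))


def enlargement(r, obj):
    """Extra area r would need in order to cover obj."""
    return area(expand(r, obj)) - area(r)


def choose_node(node_mbrs, obj):
    """Index of the MBR that grows least; ties go to the smaller MBR."""
    return min(range(len(node_mbrs)),
               key=lambda i: (enlargement(node_mbrs[i], obj), area(node_mbrs[i])))


node_mbrs = [(0, 0, 10, 10), (20, 20, 30, 30)]
new_obj = (11, 4, 12, 5)                       # just east of the first node
chosen = choose_node(node_mbrs, new_obj)
node_mbrs[chosen] = expand(node_mbrs[chosen], new_obj)
```

Here the first MBR grows by 20 area units while the second would grow by 394, so the object goes into the first node.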
Minimum Bounding Rectangle
[Figure: minimum bounding rectangle enclosing the study area]
Order Dependence of a Query
Query: Select all households within 3 km of a store that have an income greater than $100,000
1. Select all households with an income greater than $100,000; from this selected set, select all households within 3 km of a store
2. Select all households within 3 km of a store; from this selected set, select all households with an income greater than $100,000
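Both orderings return the same households, but they differ in how many spatial tests are run, which is presumably the point of the comparison. The sketch below counts distance tests for each ordering. The household records, the store location and the function names are made up for illustration.

```python
import math

# Hypothetical data: (id, x_km, y_km, income) and one store at the origin.
households = [
    (1, 1.0, 1.0, 150_000),
    (2, 2.0, 2.0, 60_000),
    (3, 5.0, 5.0, 200_000),
    (4, 0.5, 2.0, 120_000),
    (5, 8.0, 1.0, 40_000),
    (6, 2.5, 2.5, 90_000),
]
stores = [(0.0, 0.0)]


def near_store(h, radius_km=3.0):
    """Spatial test: is the household within radius_km of any store?"""
    return any(math.hypot(h[1] - sx, h[2] - sy) <= radius_km for sx, sy in stores)


def income_then_distance():
    rich = [h for h in households if h[3] > 100_000]   # cheap attribute filter first
    ids = [h[0] for h in rich if near_store(h)]
    return ids, len(rich)                              # spatial tests run on the subset


def distance_then_income():
    near = [h for h in households if near_store(h)]    # spatial test on every row
    ids = [h[0] for h in near if h[3] > 100_000]
    return ids, len(households)


ids_a, spatial_tests_a = income_then_distance()
ids_b, spatial_tests_b = distance_then_income()
```

With this data, ordering 1 runs 3 spatial tests and ordering 2 runs 6, for the same answer. Which ordering is cheaper in practice depends on how selective each condition is and on whether a spatial index is available.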
Distributed Databases
www.midcarb.org
References
Longley et al., Geographic Information Systems and Science, 2001, Chapter 11
Guenther, Environmental Information Systems, 1998, Chapter 3
Final Few Weeks
Lecture: April 15, Metadata and Interoperability
Lab: April 17 (next Thursday), project/problem set work
I'll spend a few minutes with each of you to get an update on your
progress.
Article review due April 17
Lab: April 22, project lab session.
Lecture: April 24, GIS in decision-making
Project Presentation: May 8