Unit 5 GIS Data Models and Spatial Data Structure
5.1 INTRODUCTION
You have read about the components, history and organisational aspects of
GIS in the previous unit. The ability of GIS to represent geospatial data
differentiates it from other information systems. Geospatial data comprises
both spatial and attribute data. For example, to define a spatial feature
such as a river, we need its location (where it is, in latitude and longitude)
and its characteristics, such as name, length, speed and direction of flow.
The location of the river constitutes the spatial data, whereas its
characteristics constitute the attribute data. GIS represents spatial features
on the surface of the Earth as map features on a plane surface. Data models
are used to represent spatial features in a GIS environment. There are two
types of data models, namely, vector and raster data models.
In this unit, we will discuss the basic concepts of raster and vector data models
in GIS along with their advantages and disadvantages. You will get an idea of
spatial data structures. As Database Management Systems (DBMS) represent the
dominant technology in GIS, we would like to briefly introduce you to the
fundamentals of DBMS with reference to GIS.
Objectives
After studying this unit, you will be able to:
• describe raster and vector GIS data models;
• discuss advantages and disadvantages of raster and vector data models;
• explain topology, topological and non-topological data structures; and
• define database management system in GIS.
Fundamentals of Geographic Information System
5.2 GIS DATA MODELS
Let us first discuss the concept of a data model.
A data model is, basically, a conceptual representation of the data structures
in a database. Data structures, in turn, comprise data objects, the
relationships between those objects and the rules which regulate operations on
them. In other words, a data model represents a set of rules or guidelines
which are used to convert real world features into digitally and logically
represented spatial objects. In GIS, data models comprise the rules which are
essential to define what is in an operational GIS and its supporting system.
The data model is the core of any GIS; it gives a set of constructs for
describing and representing selected aspects of the real world in a computer
(Longley et al., 2005).
Note: Models are the abstract representation of the real world in various
forms like pictorial/graphical/sculpture.
You have already read that in GIS data models, all real world features are
represented as points, lines (or arcs) and polygons. Data modellers often use
multiple models when representing the real world in a GIS environment
(Fig. 5.1). The first stage is reality, which consists of real world phenomena
such as natural and man-made features. The other three stages are the
conceptual, logical and physical models. The conceptual model is a graphical
representation developed from the real world. It determines which aspects of
the real world to include in or exclude from the model and the level of detail
at which to model each aspect. It is human-oriented and partially structured.
The logical model is the representation of reality in the form of diagrams and
lists. It has an implementation-oriented approach. The physical model presents
the actual implementation in a GIS environment and comprises tables which are
stored as databases. It has an implementation-specific approach.
[Figure: stages of increasing abstraction, from reality (real world features)
through the conceptual model and the logical model to the physical model,
moving from human-oriented to computer-oriented aspects of the real world]
[Figure: object-based model and field-based model, with conversion between
them, leading to the vector model and raster model of a spatial database]
Fig. 5.2: Illustration representing an outline model (Source: Lo and Yeung, 2009)
Object-Based Model: An object is a spatial feature with characteristics such
as a spatial boundary, relevance to the application and a feature description
(attributes). Spatial objects represent discrete features with well-defined or
identifiable boundaries, for example, buildings, parks, forest lands,
geomorphological boundaries, soil types, etc. In this model, data can be
obtained by field surveying methods (chain-tape, theodolite and total station
surveying, GPS/DGPS survey) or laboratory methods (aerial photo
interpretation, remote sensing image analysis and onscreen digitisation).
Depending on the nature of the spatial objects, we may represent them as
graphical elements of points, lines and polygons.
Field-Based Model: Spatial phenomena are real world features that vary
continuously over space with no specific boundary. Data for spatial
phenomena may be organised as fields, which are obtained from direct or
indirect sources. Direct data sources include aerial photos, remote sensing
imagery, scanning of hard copy maps, and field investigations made at
selected sample locations. We can also obtain or generate the data by applying
mathematical functions such as interpolation, sampling or reclassification
to selected sample locations. This approach is an indirect data source.
For example, a Digital Elevation Model (DEM) can be generated from topographic
data such as spot heights and contours that are usually obtained by indirect
measurements.
Note: The Digital Elevation Model (DEM) consists of an array of uniformly
spaced elevation data. A DEM is point based but it can be easily converted to
raster data by placing each elevation point at the centre of a cell.
A spatial database may be organised on either the object-based model or the
field-based model. In object-based databases, the spatial units are discrete
objects which can be obtained from field-based data by means of object
recognition and
mathematical interpolation. In the object-based model, spatial data is mostly
represented in the form of coordinate lists (i.e. vector lines), and this is
generally called the vector data model. When a database of spatial phenomena
is structured on the field-based model in the form of a grid of square or
rectangular cells, the representation is generally called the raster data
model. A geospatial database possesses two distinct components: locations and
attributes. Geographical features in the real world are very difficult to
capture and may require a large-scale database. GIS can organise reality
through data models. Each model tends to fit certain types of data and
applications better than others. All spatial data models fall into two basic
categories: raster and vector.
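The note earlier in this section says that a point-based DEM can be converted to raster data by placing each elevation point at the centre of a cell. The following is a minimal Python sketch of that idea, not a definitive implementation. The function name `points_to_raster`, the grid origin, the cell size and the sample elevations are all hypothetical values chosen for illustration.

```python
def points_to_raster(points, x_min, y_max, cell_size, n_rows, n_cols, nodata=None):
    """Place each (x, y, z) sample into the grid cell that contains it.

    Row 0 is the top row, so row numbers grow as y decreases.
    """
    grid = [[nodata] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        col = int((x - x_min) // cell_size)
        row = int((y_max - y) // cell_size)
        # Ignore samples that fall outside the grid extent
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] = z
    return grid


# Hypothetical elevation samples, each located at the centre of a 10 x 10 unit cell
samples = [
    (5, 15, 120), (15, 15, 125), (25, 15, 131),
    (5, 5, 118), (15, 5, 122), (25, 5, 127),
]
dem = points_to_raster(samples, x_min=0, y_max=20, cell_size=10, n_rows=2, n_cols=3)
```

Cells that receive no sample keep the `nodata` value, which corresponds to the no data cells of a raster layer.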
Let us now briefly discuss these two types of models.
5.2.1 Raster Data Models
Note: You have been introduced to raster data in Unit 4 of MGY-001 and Unit 10
of MGY-002.
The raster data model is composed of a regular grid of cells in a specific
sequence, and each cell within the grid holds data. The conventional sequence
is row by row, starting from the top left corner. In this model, the basic
building block is the cell. Geographic features are located by coordinates,
and every location corresponds to a cell. Each cell contains a single value
and is independently addressed with the value of an attribute. One set of
cells and associated values is a layer. Cells are arranged in layers. A data
set can be composed of many layers covering the same geographical area, e.g.,
water, paddy, forest, cashew (Fig. 5.3). The representation of points, lines
and polygons in grid format is presented in Fig. 5.4. The raster model, which
is most often used to represent continuously varying phenomena such as
elevation or climate, is also used to store pictures, satellite images and
airborne (plane-based) images. A raster image comprises a collection of grid
cells, rather like a scanned map or photo.
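The ideas of layers, cell addressing and the row-by-row sequence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two layers, their values and the helper names `values_at`, `row_by_row` and `value_counts` are hypothetical and do not come from any GIS package.

```python
from collections import Counter

# Two hypothetical layers covering the same 3 x 4 area; row 0 is the top row
layers = {
    "landuse": [
        ["water", "water", "paddy", "paddy"],
        ["water", "forest", "paddy", "cashew"],
        ["forest", "forest", "cashew", "cashew"],
    ],
    "elevation_m": [
        [12, 13, 15, 16],
        [12, 18, 16, 21],
        [20, 22, 24, 25],
    ],
}


def values_at(layers, row, col):
    """Return the single value that each layer holds at one cell address."""
    return {name: grid[row][col] for name, grid in layers.items()}


def row_by_row(grid):
    """List cell values row by row, starting from the top left corner."""
    return [value for row in grid for value in row]


def value_counts(grid):
    """Count the cells holding each value, similar to a raster attribute table."""
    return Counter(row_by_row(grid))


cell = values_at(layers, 1, 3)
counts = value_counts(layers["landuse"])
```

Because every layer shares the same grid, one (row, column) address is enough to read the land use and the elevation of the same location.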
[Figure: raster grid with rows and columns numbered 1 to 11, the cell
resolution marked, and an attribute table with a Count field]
Fig. 5.3: Illustration of raster data; (a) raster grid matrix with cell locations and
coordinates, and (b) raster grid and its attribute table
Fig. 5.4: Representation of raster grid format; (a) point (cell), line (sequence of cells), and
polygon (zone of cells) features and (b) no data cells (black in colour)
Fig. 5.5: Vector model represents point, line and polygon features
Points, lines and polygons are features which can be designated as a feature
class in a geospatial database. Each feature class pertains to a particular theme
such as habitation, transportation, forest, etc. Feature classes can be structured
as layers or themes in the database (Fig. 5.6). A feature class may be linked to an
attribute table. Every individual geographic feature corresponds to a record (row)
in the attribute table (Fig. 5.6).
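The link between a feature class and its attribute table can be sketched as two Python dictionaries that share a feature ID. The feature class, the road names and the helper functions below are hypothetical, and real GIS packages store this link in their own formats.

```python
import math

# Geometry of a hypothetical "transportation" feature class, keyed by feature ID
geometry = {
    1: {"type": "line", "coords": [(0, 0), (4, 3)]},
    2: {"type": "line", "coords": [(4, 3), (4, 9)]},
}

# Attribute table: one record (row) per feature, linked by the same feature ID
attributes = {
    1: {"name": "Road A", "surface": "paved"},
    2: {"name": "Road B", "surface": "gravel"},
}


def line_length(coords):
    """Sum the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def describe(feature_id):
    """Combine the attribute record of a feature with a value derived from its geometry."""
    record = dict(attributes[feature_id])
    record["length"] = line_length(geometry[feature_id]["coords"])
    return record


road_a = describe(1)
```

Keeping geometry and attributes in separate structures joined by an ID mirrors the spatial data and attribute data distinction made in the introduction.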
The simplest vector data model, which stores and organises the data without
establishing relationships among the geographic features, is generally called
the spaghetti model. In this model, lines in the database overlap but do not
intersect, just like spaghetti on a plate. The polygon features are defined by
lines which do not have any concept of start and end node or intersection
node. The polygons are hatched or coloured manually to represent something.
There is no data attached to them and, therefore, no data analysis is possible
in the spaghetti model (Fig. 5.7) (Rolf, 2001).
Fig. 5.7: Vector spaghetti data model; (a) Spaghetti data, (b) cleaned spaghetti data
and (c) polygons in spaghetti data
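The statement that spaghetti lines overlap but do not intersect can be illustrated with a small Python sketch. Two lines are stored only as coordinate lists. They cross geometrically, yet no node is recorded at the crossing, so the crossing has to be computed each time. The line names and coordinates are hypothetical, and the orientation test used here is a standard geometric technique rather than something described in this unit.

```python
def orientation(p, q, r):
    """Cross product of (q - p) and (r - p): positive, negative or zero."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segment ab and segment cd cross at a single interior point."""
    return (orientation(a, b, c) * orientation(a, b, d) < 0
            and orientation(c, d, a) * orientation(c, d, b) < 0)


# Spaghetti storage: each line is only a list of coordinates, with no nodes
road = [(0, 0), (10, 10)]
river = [(0, 10), (10, 0)]

# No vertex is shared, so the stored data holds no record of the crossing
shared_vertices = set(road) & set(river)

# The crossing exists geometrically and must be found by calculation
crosses = segments_cross(road[0], road[1], river[0], river[1])
```

A topological structure would instead store a node at the crossing, so the relationship could be read from the data rather than recomputed.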
We have studied the raster and vector data models. In the following
subsection, you will learn about the comparative advantages and disadvantages
of these two data models.
5.2.3 Comparison of Raster and Vector Data Models
Note: You have learnt about the basic introduction to the comparison of Raster
and Vector data in Unit 4 of MGY-001.
As you know, raster and vector data models are both important in a GIS. Each
one has its own strengths. A comparison between these two types of data models
is shown in Table 5.1.
Table 5.1: Comparison between raster and vector data models
Properties Raster Vector
[Figure residue: row-wise column values, apparently from a raster encoding illustration]
Row   Values
4     1, 8
5     4, 7
6     4, 7
7     7, 7
8     7, 7
9     8, 10
10    8, 10
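Run-length encoding, one of the raster data structures named in this unit, records for each row only the runs of adjacent cells that share a value, instead of storing every cell. The following Python sketch assumes that each run is stored as (row, first column, last column) with 1-based numbering. The grid and the function names are hypothetical.

```python
def run_length_encode(grid, target=1):
    """Return (row, first_col, last_col) for each run of target cells (1-based)."""
    runs = []
    for r, row in enumerate(grid, start=1):
        start = None
        for c, value in enumerate(row, start=1):
            if value == target and start is None:
                start = c                       # a new run begins
            elif value != target and start is not None:
                runs.append((r, start, c - 1))  # the run ended at the previous cell
                start = None
        if start is not None:
            runs.append((r, start, len(row)))   # the run reaches the end of the row
    return runs


def run_length_decode(runs, n_rows, n_cols, target=1, background=0):
    """Rebuild the full cell-by-cell grid from the stored runs."""
    grid = [[background] * n_cols for _ in range(n_rows)]
    for r, first, last in runs:
        for c in range(first, last + 1):
            grid[r - 1][c - 1] = target
    return grid


# Hypothetical 4 x 6 grid in which 1 marks the cells of a polygon
grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
]
runs = run_length_encode(grid)
```

Here 24 cell values are reduced to three runs, which shows why this structure saves storage when a raster contains long stretches of identical values.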
[Figure residue: tables from illustrations of vector data structures; the
accompanying diagrams are not recoverable]
Arc file
Arc   From   To   PL   PR   N1x   N1y   N2x   N2y
1     N1     N2   A    B    x     y     x     y
Point list
Pt. ID   Coordinates (x, y)
A        (2, 4)
B        (4, 6)
C        (9, 2)
Arc-node-coordinate list
Arc   Start node   End node
A     1            2
B     2            3
C     3            4
D     4            2
Arc ID   Coordinates (x, y)
A        (7,8) (7,6)
B        (7,6) (7,2)
C        (7,2) (5,2) (3,3) (2,6)
D        (2,6) (7,6)
[Polygons and their arcs]
a     A, B, G, H, I
b     C, D, E, F, G, H
c     I
[Arcs and the polygons on either side]
F     o  b
G     o  b
H     o  b
I     a  c
[Arc coordinates]
A     (3,8) (9,7)
B     (9,7) (9,4)
G     (3,4) (9,4)
H     (3,4) (9,8)
I     (5,7) (7,7) (7,5) (5,5)
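The arc values listed above (arcs A to D, nodes 1 to 4 and their coordinates) can be used to sketch how an arc-node structure answers connectivity questions. This sketch assumes that the two numbers after each arc are its start node and end node. That reading is an interpretation of the extracted figure data, and the function names are hypothetical.

```python
# Arc-node list: arc -> (start node, end node), as read from the figure data
arc_nodes = {"A": (1, 2), "B": (2, 3), "C": (3, 4), "D": (4, 2)}

# Arc-coordinate list: arc -> vertices from start node to end node
arc_coords = {
    "A": [(7, 8), (7, 6)],
    "B": [(7, 6), (7, 2)],
    "C": [(7, 2), (5, 2), (3, 3), (2, 6)],
    "D": [(2, 6), (7, 6)],
}


def node_coordinates(arc_nodes, arc_coords):
    """Derive each node position from the first and last vertex of its arcs."""
    nodes = {}
    for arc, (start, end) in arc_nodes.items():
        nodes[start] = arc_coords[arc][0]
        nodes[end] = arc_coords[arc][-1]
    return nodes


def arcs_at_node(arc_nodes, node):
    """List the arcs that begin or end at a node, without any coordinate comparison."""
    return sorted(arc for arc, ends in arc_nodes.items() if node in ends)


nodes = node_coordinates(arc_nodes, arc_coords)
connected = arcs_at_node(arc_nodes, 2)
```

Connectivity is read directly from the arc-node list, which is the kind of relationship that a topological data structure records explicitly.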
A vector data structure that is common among GIS software is the Computer
Aided Design (CAD) data structure. Drawing Exchange Format (DXF) is
used in CAD packages (e.g., AutoCAD) for transferring data files.
DXF does not support topology and arranges the data as individual layers.
This structure consists of listing elements, not features, defined by strings of
vertices, to define geographic features, e.g., points, lines, or areas. There is
considerable redundancy with this data model since the boundary segment
between two polygons is stored twice, once for each feature. This format
allows the user to draw each layer using different line symbols, colours
and texts. In this structure, polygons are independent, and it is difficult
to answer questions about the adjacency of features. The CAD vector model
lacks the definition of spatial relationships between features that is
provided by the topological data model.
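The redundancy and the adjacency problem described above can be sketched in Python. Two adjacent polygons are stored as independent vertex lists, so their common boundary appears in both, and adjacency can only be found by comparing coordinates. The parcels and the helper function `edges` are hypothetical.

```python
def edges(ring):
    """Return the undirected edges of a closed polygon ring."""
    return {
        frozenset((ring[i], ring[(i + 1) % len(ring)]))
        for i in range(len(ring))
    }


# Two hypothetical adjacent polygons, each stored independently
parcel_a = [(0, 0), (4, 0), (4, 4), (0, 4)]
parcel_b = [(4, 0), (8, 0), (8, 4), (4, 4)]

# The common boundary is stored twice, once in each polygon
shared = edges(parcel_a) & edges(parcel_b)

# Adjacency is not recorded anywhere; it must be derived by comparison
adjacent = len(shared) > 0
```

This comparison only works when both polygons use exactly the same vertices along the boundary, which is one reason adjacency questions are difficult to answer in a non-topological structure.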
Since the 1990s, almost all commercial GIS packages, such as ArcGIS, MapInfo
and Geomedia, have adopted non-topological data structures. The shapefile
(.shp) is a standard non-topological data format used in GIS packages. The
geometry of a shapefile is stored in two files with the extensions .shp and
.shx. The .shp file stores the feature geometry and the .shx file maintains
the index of the feature geometry. The advantage of a non-topological data
structure such as the shapefile is that it displays more quickly on the
system than topological data. Many software packages, such as ArcGIS and
MapInfo, use the .shp file format.
Note: A shapefile comprises points (a pair of x, y coordinates), lines (series
of points) and polygons (series of lines). There are no files to describe the
spatial relationships between geometric objects, and polygon boundaries are
duplicated in a shapefile.
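As an illustration of how a .shp file stores geometry, the Python sketch below decodes the fixed 100-byte header at the start of a .shp file. The byte positions follow the published shapefile technical description as understood here, and should be checked against that description before use. The sample header is synthetic and built in memory, and the function name `parse_shp_header` is hypothetical.

```python
import struct

# Shape type codes for the common two-dimensional geometries
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}


def parse_shp_header(header):
    """Decode the main fields of the 100-byte header of a .shp file."""
    if len(header) < 100:
        raise ValueError("header must be at least 100 bytes")
    file_code = struct.unpack(">i", header[0:4])[0]             # big-endian
    length_words = struct.unpack(">i", header[24:28])[0]        # big-endian, 16-bit words
    version, shape_code = struct.unpack("<ii", header[28:36])   # little-endian
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    if file_code != 9994:
        raise ValueError("unexpected file code")
    return {
        "length_bytes": length_words * 2,
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_code, "Other"),
        "bbox": (xmin, ymin, xmax, ymax),
    }


# Synthetic header for an empty polygon file (not read from a real dataset)
sample = (
    struct.pack(">i", 9994) + bytes(20) + struct.pack(">i", 50)
    + struct.pack("<ii", 1000, 5)
    + struct.pack("<8d", 0.0, 0.0, 10.0, 10.0, 0.0, 0.0, 0.0, 0.0)
)
info = parse_shp_header(sample)
```

The header describes only the shape type and the extent of the features. Nothing in it records relationships between features, which is consistent with the non-topological nature of the format.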
Check Your Progress II
Spend 5 mins
1. What is topology? Explain topological and non-topological data structures.
..................................................................................................................
..................................................................................................................
..................................................................................................................
..................................................................................................................
..................................................................................................................
5.5 ACTIVITY
You have read about the raster and vector models and how they work
theoretically. Now you should create a spatial database using GIS software in
different formats and understand the differences.
5.6 SUMMARY
You have learnt the following in this unit:
• Real world features such as temples, parks, roads, railways, crop land, and
forest land are represented as points, lines/polylines and polygons. Spatial
information of features or objects can be stored in a GIS using vector or
raster models. Real world features need to be translated into simplified
representations which can be stored and updated in a system.
• Two data models currently dominate commercial GIS software: the vector data
model, which is used to represent discrete features, and the raster data
model, which is most often used to represent continuously varying phenomena.
• The main advantage of the vector model is easy access and complex analysis,
while the raster model is useful for overlaying and spatial analysis.
• The raster data structure represents information in the form of grid cells
or pixels (pixel stands for picture element). Important raster data
structures, viz. cell-by-cell encoding, run length encoding, and quadtree,
are ways of storing raster data.
• Vector data structures are mainly of two kinds: topological (e.g., TIGER,
coverage) and non-topological.
5.7 UNIT END QUESTIONS
Spend 30 mins
1) What are geospatial models? Explain raster and vector GIS models.
2) Explain the advantages and disadvantages of raster and vector GIS models.
Compare both the models.
5.10 ANSWERS
Check Your Progress I
1) The raster data model represents the real world as a regular set of cells
in a grid pattern, whereas the vector data model uses sets of coordinates and
feature characteristic (attribute) data to define discrete objects. We can
represent temples, post offices, hospitals, wells and buildings on large scale
(i.e. 1:5000 or less) as points; railways, roads and drainage as lines or
polylines; and forest boundaries, rock types, etc. as polygons.
Check Your Progress II
1) Topology is the set of spatial relations among geographical features, e.g.,
points, lines/polylines and polygons in GIS. It determines and describes the
relationships between connecting or adjacent features. There are two
types of vector data structures: topological, e.g., TIGER and
coverage, and non-topological, e.g., shape file.
Unit End Questions
1) Geospatial data models provide a method of representing geospatial
data in digital form. GIS models such as raster or vector are approaches for
storing the locational information of spatial features in a database. Raster
models use a grid-cell data structure to represent an object whereas vector
model implies the use of directional lines to characterise a geographic
feature.
2) Both the raster and vector models for storing geospatial data have unique
advantages and disadvantages. Raster models have simple data structure and
require a large space for data storage, while vector models are complex and
occupy small space.
3) Data structures are a method of storing data in a systematic way. There are
mainly two data structures used in GIS, namely, raster and vector. In the
raster structure, mainly three data structures are used, i.e. cell-by-cell
encoding, run-length encoding and quad tree, while in the vector structure,
topological and non-topological data structures are used to store the spatial
data.