GIS-Data Structure and Attribute Data Model
Introduction
What’s in the “S”?
– Systems: the technology
– Science: the concepts and theory
– Studies: the societal context
Many of our decisions depend on the details of our immediate surroundings and require
information about specific places on the Earth’s surface.
Such information is called geographical because it helps us to distinguish one place from
another and to make decisions for one place that are appropriate for that location.
The common ground between information processing and the many fields
using spatial analysis techniques. (Tomlinson, 1972)
A decision support system involving the integration of spatially referenced
data in a problem solving environment. (Cowen, 1988)
– File formats
A prerequisite for describing the real world by use of GIS is that the different types of
geographical information can be stored in a computer.
All the operations in a computer are based on the storage and handling of numbers.
This is why the data stored in a computer are known as digital data.
In GIS there is a need to store graphical figures, images, numerical values, and plain
text.
All these forms of data must thus be able to be converted to digital representation.
Vector Model
The fundamental concept of vector GIS is that all geographic features in the real world
can be represented either as:
Points or Dots (nodes): Points are the fundamental and simplest form of geographical
objects and are zero-dimensional because they have no extension. Each point is
represented by a coordinate pair. Example : trees, poles, tube wells, earthquakes.
Lines (arcs) : Lines are formed by points linked together with line segments. A line has
two points as a boundary: a start point and an end point. Lines are one-dimensional, as
they stretch in only one direction. Mathematically, a vector is a straight line having both
magnitude and direction. Example: streams, streets, sewers, pipelines, electrical lines.
Areas (polygons): An area is represented by a single line that encloses a space, thus
forming a closed polygon. The surrounding line, called a ring, has to start and end at
the same point in order for the area to be closed and defined. Areas are two-
dimensional because they stretch in two dimensions. Example: land parcels, cities,
counties, forest, rock type, water body.
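The three primitives described above can be sketched as simple data structures. This is a minimal illustration, not the storage format of any particular GIS package; the class names `Point`, `Line`, and `Polygon` and the sample coordinates are assumptions made for the example.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Coord = Tuple[float, float]  # one (x, y) coordinate pair


@dataclass
class Point:
    """Zero-dimensional feature: a single coordinate pair."""
    xy: Coord


@dataclass
class Line:
    """One-dimensional feature: an ordered list of vertices."""
    vertices: List[Coord]

    def length(self) -> float:
        # Sum the lengths of the consecutive straight segments.
        return sum(
            hypot(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(self.vertices, self.vertices[1:])
        )


@dataclass
class Polygon:
    """Two-dimensional feature: a ring that starts and ends at the same point."""
    ring: List[Coord]

    def __post_init__(self) -> None:
        # The area is only defined when the surrounding ring is closed.
        if self.ring[0] != self.ring[-1]:
            raise ValueError("ring must start and end at the same point")

    def area(self) -> float:
        # Shoelace formula over the closed ring.
        s = sum(
            x1 * y2 - x2 * y1
            for (x1, y1), (x2, y2) in zip(self.ring, self.ring[1:])
        )
        return abs(s) / 2.0


# Hypothetical sample features.
tree = Point((2.0, 3.0))
street = Line([(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)])
parcel = Polygon([(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)])
```

The point carries no measurable extent, the line has only a length, and the polygon has both an area and a perimeter, which mirrors the zero-, one-, and two-dimensional classification.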
Spaghetti Model
Spaghetti data are a collection of points and lines with no real connection.
What appears as a long, continuous line on the map or in the terrain may consist of
several line segments that are to be found in odd places in the data file.
There are no specific points that designate where lines might cross, nor are there any
details of logical relationships between objects.
Topology Model
The topology model is one in which the connections and relationships between objects
are described independent of their coordinates.
The topology model overcomes the major weakness of the spaghetti model, which
lacks the relationships requisite to many GIS manipulations and presentation.
A node can be a point where two lines intersect, an endpoint on a line, or a given point
on a line. For example, the intersection of two roads is a node.
A link is a segment of a line between two nodes. Links have a start node and an end
node and therefore have a direction in a topology model. Several links can share a
node, and a collection of such links and nodes is known as a network.
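A node-and-link network of this kind might be sketched as follows. The node ids, link ids, and the helper functions `links_at_node` and `reachable` are hypothetical names chosen for illustration. The point of the sketch is that connectivity questions are answered from the node references alone, without comparing coordinates.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Node coordinates are stored once; links refer to nodes only by id.
nodes: Dict[str, Tuple[float, float]] = {
    "A": (0.0, 0.0),
    "B": (5.0, 0.0),  # e.g. the intersection of two roads
    "C": (5.0, 4.0),
    "D": (9.0, 0.0),
}

# Each link has a start node and an end node, which gives it a direction.
links: Dict[str, Tuple[str, str]] = {
    "L1": ("A", "B"),
    "L2": ("B", "C"),
    "L3": ("B", "D"),
}


def links_at_node(links: Dict[str, Tuple[str, str]]) -> Dict[str, List[str]]:
    """List the links that share each node (no coordinates involved)."""
    index: Dict[str, List[str]] = defaultdict(list)
    for link_id, (start, end) in links.items():
        index[start].append(link_id)
        index[end].append(link_id)
    return dict(index)


def reachable(start: str, links: Dict[str, Tuple[str, str]]) -> Set[str]:
    """Return the nodes reachable from `start` when following link direction."""
    outgoing: Dict[str, List[str]] = defaultdict(list)
    for s, e in links.values():
        outgoing[s].append(e)
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in outgoing[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

In a spaghetti file the same questions would require matching raw coordinates between unrelated line segments, which is the weakness the topology model removes.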
Raster Model
The area is covered by a grid of (usually) equal-sized, square cells.
The raster model represents reality through selected surfaces arranged in a regular
pattern.
Reality is thus generalized in terms of uniform, regular cells, which are usually
rectangular or square but may be triangular or hexagonal.
The raster model is in many ways a mathematical model, as represented by the regular
cell pattern.
Attributes are recorded by assigning each cell a single value based on the majority
feature (attribute) in the cell, such as land use type.
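The majority rule can be illustrated with a small sketch that aggregates a finer grid of land-use codes into coarser cells. The function name `majority_raster`, the block size, and the land-use codes are assumptions made for the example, and ties are resolved arbitrarily by first occurrence.

```python
from collections import Counter
from typing import List


def majority_raster(fine: List[List[str]], block: int) -> List[List[str]]:
    """Aggregate a fine grid into block x block cells, keeping the most common value."""
    rows, cols = len(fine), len(fine[0])
    out: List[List[str]] = []
    for r in range(0, rows, block):
        out_row: List[str] = []
        for c in range(0, cols, block):
            # Collect every fine value that falls inside this coarse cell.
            values = [
                fine[i][j]
                for i in range(r, min(r + block, rows))
                for j in range(c, min(c + block, cols))
            ]
            # most_common(1) gives the value with the highest count;
            # equal counts are ordered by first occurrence (an arbitrary tie-break).
            out_row.append(Counter(values).most_common(1)[0][0])
        out.append(out_row)
    return out


# Hypothetical land-use codes: F = forest, W = water, U = urban.
land_use = [
    ["F", "F", "W", "W"],
    ["F", "U", "W", "W"],
    ["U", "U", "F", "W"],
    ["U", "U", "F", "F"],
]
cells = majority_raster(land_use, 2)
```

Each coarse cell ends up with exactly one value, so minority features inside a cell (the single urban patch in the top-left block, for instance) are lost. This is the generalization that the raster model implies.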
Image data is a special case of raster data in which the “attribute” is a reflectance value
from the electromagnetic spectrum.
Cells in image data are often called pixels (picture elements).
Raster models are created by assigning real-world values to pixels.
The assigned values comprise the attributes of the objects. Values are assigned
to all the pixels in a raster.
[Figure: a real-world scene shown in raster representation (a 10 × 10 grid with rows and columns numbered 0 to 9 and cells coded R, T, or H) and in vector representation (point, line, and polygon features).]
Vector                          Raster
Point = position, no area       Point = 1 cell
Polygon = area and perimeter    Polygon = group of contiguous cells joined at edges or corners
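The statement that a vector point becomes a single raster cell can be made concrete with a short sketch. It assumes a north-up grid whose origin is the top-left corner; the function name `point_to_cell` and its parameters are hypothetical.

```python
from math import floor
from typing import Tuple


def point_to_cell(
    x: float, y: float, x_min: float, y_max: float, cell_size: float
) -> Tuple[int, int]:
    """Return (row, col) of the cell containing (x, y); row 0 is the top row."""
    col = floor((x - x_min) / cell_size)
    row = floor((y_max - y) / cell_size)
    return row, col


# A point at (35, 95) on a grid covering x 0..100 and y 0..100 with 10-unit cells.
cell = point_to_cell(35.0, 95.0, x_min=0.0, y_max=100.0, cell_size=10.0)
```

The exact position inside the cell is discarded: any point between x = 30 and 40 and y = 90 and 100 maps to the same (row, col) pair.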
Topology (geometrical relationships between spatial objects viz. points, lines, areas)
can be completely described, including network linkages.
A separate data model is used to store and maintain attribute data for GIS software.
These data models may exist internally within the GIS software, or may be reflected in
external commercial Database Management Software (DBMS). A variety of different data
models exist for the storage and management of attribute data. The most common are:
• Tabular
• Hierarchical
• Network
• Relational
• Object Oriented
Tabular : The simple tabular model stores attribute data as sequential data files with fixed
formats (or comma-delimited for ASCII data), so that each attribute value sits at a known
location in a predefined record structure. This type of data model is outdated in the GIS
arena. It lacks any method of checking data integrity and is inefficient with respect to data
storage, e.g. it offers only limited indexing capability for attributes or records.
Hierarchical : The hierarchical database organizes data in a tree structure. Data is
structured downward in a hierarchy of tables.
Hierarchical DBMS have not gained any noticeable acceptance for use within GIS.
Network : The network database organizes data in a network or plex structure. Any
column in a plex structure can be linked to any other. Network DBMS have not found
much more acceptance in GIS than the hierarchical DBMS.
Relational : The relational database organizes data in tables. Each table is identified by a
unique table name and is organized by rows and columns. Each column within a table also
has a unique name.
Columns store the values for a specific attribute. Each row represents one record in the
table. In a GIS each row is usually linked to a separate spatial feature. Accordingly, each
row comprises several columns, each column containing a specific value for that
geographic feature.
The ability to join tables through use of a common column is the essence of the relational
model. The relational database model is the most widely accepted for managing the
attributes of geographic data.
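A join through a common column can be sketched with Python's built-in `sqlite3` module. The table names `parcels` and `owners`, their columns, and the sample rows are invented for illustration and do not come from any particular GIS schema.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parcels (
        parcel_id INTEGER PRIMARY KEY,  -- one row per spatial feature
        land_use  TEXT,
        owner_id  INTEGER               -- common column shared with owners
    );
    CREATE TABLE owners (
        owner_id INTEGER PRIMARY KEY,
        name     TEXT
    );
    INSERT INTO parcels VALUES (1, 'forest', 10), (2, 'urban', 11), (3, 'water', 10);
    INSERT INTO owners  VALUES (10, 'State'), (11, 'City');
    """
)

# Join the two tables on the shared owner_id column.
rows = conn.execute(
    """
    SELECT p.parcel_id, p.land_use, o.name
    FROM parcels AS p
    JOIN owners AS o ON p.owner_id = o.owner_id
    ORDER BY p.parcel_id
    """
).fetchall()
conn.close()
```

Owner details are stored once and attached to any number of parcels at query time, which is what makes the shared column so central to the relational model.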
Object Oriented : The object-oriented database model manages data through objects. An
object is a collection of data elements and operations that together are considered a single
entity. The object-oriented database is a relatively new model. To date, only a few GIS
packages are promoting the use of this attribute data model.
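The idea of an object as data elements plus operations can be sketched with a small class. This illustrates the concept only and is not how any specific object-oriented DBMS stores features; the class `Parcel`, its fields, and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Parcel:
    """A single entity bundling attribute data with the operations on it."""
    parcel_id: int
    land_use: str
    width_m: float
    height_m: float

    def area_m2(self) -> float:
        # Operation derived from the object's own data (rectangular parcel assumed).
        return self.width_m * self.height_m

    def rezone(self, new_use: str) -> None:
        # Operation that updates the object's state.
        self.land_use = new_use


lot = Parcel(parcel_id=1, land_use="forest", width_m=40.0, height_m=25.0)
lot.rezone("urban")
```

In the relational model the attribute values would live in a table row and the operations in separate application code; here both travel together as one object.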