Unit - 5.2 - Database Management

Unit 5.2 covers database management focusing on types of data, specifically spatial and non-spatial data, and their respective merits and demerits. It discusses data input and output methods, various software modules including ArcGIS and QGIS, and the differences between raster and vector data models. Additionally, it explains the structure and topology of vector data, raster data encoding techniques, and the advantages and disadvantages of different data models.

Uploaded by

gomethaganmusa

Unit 5.2 - Database Management
• Introduction – Types of data
• Spatial data – Non spatial data
• Merits and Demerits of Raster Data
• Merits and Demerits of Vector data
• Data input – Methods: digitization, scanning, keyboard entry
• Data output – Methods
• Software modules: ArcGIS, ArcInfo, ArcToolbox, ArcEdit, ArcMap, ArcCatalog
• QGIS and other open source software
Data and Information
1. Data - constitute the building blocks of information
Data is of little use unless it is transformed into
information
2. Information - produced by processing the data
Information is an answer to a question based on
raw data
GIS can transform data into information

Data → Computer → Information → GIS → Knowledge

Types of data
• Spatial data
✓ Describes the absolute and relative location of geographic features
✓ Organized into layers
✓ Represented as vector (points, lines, polygons), raster (cells) and
TIN
• Non spatial data
✓ Describes characteristics of the spatial features
✓ Quantitative and/or qualitative
✓ Attribute data is often referred to as tabular data
• Raster data is made up of pixels (or cells), and each pixel has an associated value. Simplifying slightly, a digital photograph is an example of a raster dataset where each pixel value corresponds to a particular colour. In GIS, the pixel values may represent elevation above sea level, chemical concentrations, rainfall, etc. The key point is that all of this data is represented as a grid of (usually square) cells.

• Vector data consists of individual points, which (for 2D data) are stored as pairs of (x, y) co-ordinates. The points may be joined in a particular order to create lines, or joined into closed rings to create polygons, but all vector data fundamentally consists of lists of co-ordinates that define vertices, together with rules to determine whether and how those vertices are joined.
Raster vs Vector
RASTER | VECTOR
Composed of pixels, arranged to form an image | Composed of paths, dictated by mathematical formulas
Constrained by resolution and dimensions | Infinitely scalable
Capable of rich, complex color blends | Difficult to blend colors without rasterizing
Large file sizes (but can be compressed) | Small file sizes
File types include .jpg, .gif, .png, .tif, .bmp, .psd; plus .eps and .pdf when created by raster programs | File types include .ai, .cdr, .svg; plus .eps and .pdf when created by vector programs
Raster software includes Photoshop and GIMP | Vector software includes Illustrator, CorelDraw, and InkScape
Perfect for “painting” | Perfect for “drawing”
Capable of detailed editing | Less detailed, but offers precise paths


Spatial representation
DISCRETE ENTITIES | CONTINUOUS FIELDS
Located using coordinates | Variation of an attribute over space as a continuous field
Has a clear boundary | No physical boundary can be observed
E.g. buildings, roads, land parcels, etc. | E.g. temperature, pressure, elevation, etc.
GIS data models
Data model - a conceptual model of the real world
Types
• Raster data model – field conceptual model
A representation of the world as a surface divided into a regular grid of cells. Raster models are useful for storing data that varies continuously, as in an aerial photograph, a satellite image, a surface of chemical concentrations, or an elevation surface. Geographic space is represented by an array of cells or pixels, arranged in rows and columns. Each cell has a value (integer, floating point, alphanumeric) that represents information.

• Vector data model – A representation of the world using points, lines, and polygons. Vector models are useful for storing data that has discrete boundaries, such as country borders, land parcels, and streets. Geographic phenomena are represented as points, lines and polygons and stored as 2D (x, y) coordinates.
Vector data model
• Point: A location depicted by a single set of (x, y) coordinates at the scale of
abstraction. The wells in a village, electricity poles in a town and cities in the world
map are the examples of spatial features described by points.
• Lines: Ordered sets of (x, y) coordinate pairs arranged to form a linear feature. The
curves in a linear feature are generated by increasing the density of points/vertices.
The roads, rails and telephone cables are the examples of the spatial features
described by lines.
• Polygons: The set of (x, y) coordinate pairs enclosing a homogeneous area. The land
parcels, agricultural farms and water bodies are the examples of the spatial features
described by polygons.
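
As a rough illustration of the three vector primitives above, the sketch below stores a point, a line and a polygon as plain coordinate lists. The feature names and coordinates are hypothetical, and the length and area helpers are added only to show that the geometry follows from the stored vertices.

```python
import math

# Hypothetical sample features; all coordinates are made up for illustration.
well = (3.0, 4.0)                                   # point: one (x, y) pair
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]         # line: ordered vertices
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0),
          (0.0, 3.0), (0.0, 0.0)]                   # polygon: closed ring


def line_length(vertices):
    """Sum of the straight segment lengths between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Shoelace formula; the ring must repeat its first vertex at the end."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0


print(line_length(road), polygon_area(parcel))
```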
Raster data model
• Point: A point can be represented by a single pixel in raster model.
• Line: A line is a chain of spatially connected cells with the same
value.
• Polygon: A water body in raster data is represented as a set of
contiguous pixels having same value that represents a homogeneous
area.
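
The same three feature types can be sketched in a small raster. The grid and the cell codes below are hypothetical; the example only shows that a point occupies one cell, a line is a chain of connected cells with one value, and a polygon is a block of contiguous cells with one value.

```python
# 0 = background; the other codes are hypothetical feature values.
WELL, ROAD, LAKE = 1, 2, 3

grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 3, 3],
    [0, 0, 0, 0, 3, 3],
    [2, 2, 2, 0, 0, 0],
    [0, 0, 0, 2, 2, 2],
]


def cells_with_value(raster, value):
    """Return the (row, col) positions of every cell holding the given value."""
    return [(r, c)
            for r, row in enumerate(raster)
            for c, v in enumerate(row)
            if v == value]


print(cells_with_value(grid, WELL))   # point: a single cell
print(cells_with_value(grid, ROAD))   # line: a chain of connected cells
print(cells_with_value(grid, LAKE))   # polygon: contiguous cells
```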
SOURCES OF DATA
DATA INPUT METHODS
• Hard copy maps
• Aerial photographs
• Remotely-sensed imagery
• Point data samples from surveys
• Existing digital data files


Data input methods for GIS
DIGITIZATION
• Manual digitizing
• Automatic scanning
• Entry of coordinates using coordinate geometry
• Conversion of existing digital data
Data output methods for GIS
Software modules
• Commercial GIS software
– ArcView 3.X
– Intergraph GeoMedia
– Maptitude Mapping Software
– MapInfo
– Manifold System

• Open source GIS software
– GRASS GIS
– gvSIG
– ILWIS (Integrated Land and Water Information System)
– JUMP GIS / OpenJUMP ((Open) Java Unified Mapping Platform)
– MapWindow GIS
– QGIS (previously known as Quantum GIS)
– SAGA GIS (System for Automated Geoscientific Analysis)
– uDig
Contd.,
• Web map servers
– GeoServer
– MapGuide Open Source
– Mapnik
– OpenStreetMap
– MapServer

• Spatial database management systems
– PostGIS
– ArangoDB
– SpatiaLite
– TerraLib
– OrientDB
Vector data structure
• Simple features – easy to create and store, but they lack connectivity relationships, which makes them inefficient for modelling
– Point entities
– Line entities
– Simple polygons
• Topological features – built on topology, a mathematical procedure that describes how features are spatially related and helps ensure data quality
– Networks
– Polygons with explicit topological structures
– Fully topological polygon network structure
– Triangular irregular network (TIN)
Simple features
• Point entities: These represent all geographical entities that are positioned by a single XY coordinate pair. Along with the XY coordinates, the point must store other information, such as what the point represents.
• Line entities: Linear features made by tracing two or more XY coordinate pairs.
– Simple line: It requires a start and an end point.
– Arc: A set of XY coordinate pairs describing a continuous complex line. The shorter the line segments and the higher the number of coordinate pairs, the closer the chain approximates a complex curve.
• Simple polygons: Enclosed structures formed by joining a set of XY coordinate pairs. The structure is simple, but it has a few disadvantages:
– Lines between adjacent polygons must be digitized and stored twice; improper digitization gives rise to slivers and gaps
– It conveys no information about neighbors
– Creating islands is not possible
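
The remark under "Arc" in the list above (shorter segments and more coordinate pairs approximate a curve more closely) can be checked numerically. The sketch below is an illustration only: it assumes a quarter circle of radius 1 as the curve, and the helper names are hypothetical.

```python
import math


def quarter_circle_chain(radius, n_segments):
    """Vertices of a chain of straight segments approximating a quarter circle."""
    step = math.pi / 2 / n_segments
    return [(radius * math.cos(i * step), radius * math.sin(i * step))
            for i in range(n_segments + 1)]


def chain_length(vertices):
    """Total length of the straight segments joining consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


true_length = math.pi / 2  # exact arc length of a quarter circle, radius 1

# Error of the chain length for 1, 4 and 16 segments.
errors = [true_length - chain_length(quarter_circle_chain(1.0, n))
          for n in (1, 4, 16)]
print(errors)
```

The error shrinks as the number of vertices grows, at the cost of storing more coordinate pairs.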
Topological features

• Connectivity – information about the linkages among spatial objects
• Contiguity – information about neighboring spatial objects
• Containment – information about inclusion of one spatial object within another spatial object
1. Connectivity
Arc node topology defines connectivity - arcs are connected to each other if
they share a common node. This is the basis for many network tracing and
path finding operations. Arcs represent linear features and the borders of area
features. Every arc has a from-node which is the first vertex in the arc and a to-
node which is the last vertex. These two nodes define the direction of the arc.
Nodes indicate the endpoints and intersections of arcs. They do not exist
independently and therefore cannot be added or deleted except by adding and
deleting arcs.
Contd.,
• Nodes can, however, be used to represent point features which connect
segments of a linear feature (e.g., intersections connecting street segments,
valves connecting pipe segments).
• Arc-node topology is supported through an arc-node list. For each arc in
the list there is a from node and a to node. Connected arcs are determined
by common node numbers.
Arc – node list
Arc From node To node
1 10 11
2 11 12
3 11 13
4 13 14
5 14 15
6 14 16
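
A minimal sketch of how the arc-node list above could be queried. The arc and node numbers are copied from the table; the function names are hypothetical, and the path trace ignores arc direction, which is an assumption made to keep the example short.

```python
from collections import defaultdict, deque

# Arc-node list from the table above: arc -> (from_node, to_node).
ARC_NODES = {1: (10, 11), 2: (11, 12), 3: (11, 13),
             4: (13, 14), 5: (14, 15), 6: (14, 16)}


def connected_arcs(arc_nodes, arc_id):
    """Arcs that share at least one node with the given arc."""
    nodes = set(arc_nodes[arc_id])
    return sorted(a for a, ends in arc_nodes.items()
                  if a != arc_id and nodes & set(ends))


def trace_path(arc_nodes, start, goal):
    """Breadth-first search over nodes; returns the arc ids along the path."""
    neighbours = defaultdict(list)
    for arc, (a, b) in arc_nodes.items():
        neighbours[a].append((b, arc))
        neighbours[b].append((a, arc))   # assumption: arcs usable both ways
    queue, seen = deque([(start, [])]), {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, arc in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [arc]))
    return None  # no route between the two nodes


print(connected_arcs(ARC_NODES, 3))
print(trace_path(ARC_NODES, 10, 16))
```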
2. Contiguity
Polygon topology defines contiguity. The polygons are said to be
contiguous if they share a common arc. Contiguity allows the vector
data model to determine adjacency. The from node and to node of an
arc indicate its direction, and it helps determining the polygons on its
left and right side. Left-right topology refers to the polygons on the left
and right sides of an arc.
In the illustration, polygon B is on the left and polygon C is on the right
of the arc 4. Polygon A is outside the boundary of the area covered by
polygons B, C and D. It is called the external or universe polygon, and
represents the world outside the study area. The universe polygon
ensures that each arc always has a left and right side defined.
Contd.,

Left – Right Topology
Arc  Left polygon  Right polygon
1 A B
2 A C
3 C D
4 B C
5 C D
6 B D
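
The left-right table can be turned into adjacency information with a few lines of code. This is a sketch under the assumption that two polygons are neighbors whenever they appear on opposite sides of the same arc; the function names are hypothetical and the data is copied from the table above.

```python
# Left-right topology from the table above: arc -> (left_polygon, right_polygon).
LEFT_RIGHT = {1: ("A", "B"), 2: ("A", "C"), 3: ("C", "D"),
              4: ("B", "C"), 5: ("C", "D"), 6: ("B", "D")}


def adjacency(left_right):
    """Map each polygon to the set of polygons it shares an arc with."""
    adj = {}
    for left, right in left_right.values():
        adj.setdefault(left, set()).add(right)
        adj.setdefault(right, set()).add(left)
    return adj


def shared_arcs(left_right, p, q):
    """Arcs that separate polygon p from polygon q."""
    return sorted(arc for arc, sides in left_right.items()
                  if set(sides) == {p, q})


print(adjacency(LEFT_RIGHT))
print(shared_arcs(LEFT_RIGHT, "C", "D"))
```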
3. Containment
• Geographic features cover distinguishable area on the surface of the earth.
An area is represented by one or more boundaries defining a polygon. The
polygons can be simple or they can be complex with a hole or island in the
middle. In the illustration, assume a lake with an island in the middle. The
lake actually has two boundaries, one which defines its outer edge and the
other (island) which defines its inner edge. An island defines the inner
boundary of a polygon. Polygon D is made up of arcs 5, 6 and 7. The 0
before the 7 in its arc list indicates that arc 7 creates an island in the polygon.
• Polygons are represented as an ordered list of arcs and not in terms of X, Y
coordinates. This is called Polygon-Arc topology. Since arcs define the
boundary of polygon, arc coordinates are stored only once, thereby
reducing the amount of data and ensuring no overlap of boundaries of the
adjacent polygons.
Contd.,

Polygon Arc Topology

Polygon  Arc list
B        1, 4, 6, 3
C        2, 3, 5, 4
D        5, 6, 0, 7
E        7
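
The sketch below shows one way a polygon-arc list could be read. Following the text's remark that a 0 precedes arc 7, polygon D is written here as 5, 6, 0, 7. Treating every arc after a 0 as part of an island boundary is an assumption of this example, and the function names are hypothetical.

```python
# Polygon-arc lists; 0 marks the start of an island (inner boundary).
POLYGON_ARCS = {"B": [1, 4, 6, 3], "C": [2, 3, 5, 4],
                "D": [5, 6, 0, 7], "E": [7]}


def split_rings(arc_list):
    """Split an arc list into (outer boundary arcs, list of island arc lists)."""
    outer, islands = [], []
    target = outer
    for arc in arc_list:
        if arc == 0:
            islands.append([])       # a 0 opens a new island ring
            target = islands[-1]
        else:
            target.append(arc)
    return outer, islands


def contained_polygons(polygon_arcs, name):
    """Polygons whose outer boundary matches one of the islands of `name`."""
    _, islands = split_rings(polygon_arcs[name])
    island_sets = [set(ring) for ring in islands]
    return sorted(other for other, arcs in polygon_arcs.items()
                  if other != name
                  and set(split_rings(arcs)[0]) in island_sets)


print(split_rings(POLYGON_ARCS["D"]))
print(contained_polygons(POLYGON_ARCS, "D"))
```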
Networks
• A network is a topologic feature model which is defined as a line graph
composed of links representing linear channels of flow and nodes
representing their connections. The topologic relationship between the
features is maintained in a connectivity table. By consulting connectivity
table, it is possible to trace the information flowing in the network.

• Polygons with explicit topological structures : Introducing explicit
topological relationships takes care of islands as well as neighbors. The
topological structures are built either by creating topological links during
data input or using software. Dual Independent Map Encoding (DIME)
system of US Bureau of the Census is one of the first attempts to create
topology in geographic data.
Contd.,
• Polygons are formed using the lines and their
nodes.
• Once formed, polygons are individually
identified by a unique identification number.
• The topological information among the
polygons is computed and stored using the
adjacency information (the nodes of a line, and
identifiers of the polygons to the left and right
of the line) stored with the lines.
Contd.,
• Fully topological polygon network structure
A fully topological polygon network structure is built using boundary chains
that are digitized in any direction. It takes care of islands and lakes and allows
automatic checks for improper polygons. Neighborhood searches are fully
supported. These structures are edited by moving the coordinates of individual
points and nodes, by changing polygon attributes and by cutting out or adding
sections of lines or whole polygons.
• Triangular Irregular Network (TIN)
TIN represents surface as contiguous non-overlapping triangles created by
performing Delaunay triangulation. These triangles have a unique property that
the circumcircle that passes through the vertices of a triangle contains no other
point inside it. TIN is created from a set of mass points with x, y and z
coordinate values. This topologic data structure manages information about the
nodes that form each triangle and the neighbors of each triangle.
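
The empty-circumcircle property described above can be tested directly. The sketch below uses the standard determinant form of the in-circle test; it does not build a triangulation, it only checks one that is already given. The sample points and both candidate triangulations are hypothetical.

```python
def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle abc."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The sign of det depends on vertex order, so correct for orientation.
    orientation = ((b[0] - a[0]) * (c[1] - a[1])
                   - (b[1] - a[1]) * (c[0] - a[0]))
    return det * orientation > 0


def is_delaunay(points, triangles):
    """Check that no triangle's circumcircle contains another point."""
    return all(not in_circumcircle(points[i], points[j], points[k], points[m])
               for i, j, k in triangles
               for m in range(len(points)) if m not in (i, j, k))


# Four hypothetical mass points (z values omitted; the test is planar).
points = [(0, 0), (1, 0), (0, 1), (2, 2)]
good = [(0, 1, 2), (1, 3, 2)]   # split along the diagonal 1-2
bad = [(0, 1, 3), (0, 3, 2)]    # split along the diagonal 0-3

print(is_delaunay(points, good), is_delaunay(points, bad))
```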
TIN representation
Raster data structure
• Entities are stored in a matrix of rectangular cells
• A code is given to each cell which informs users which entity is
present in which cell
Types
a) Entity model – It represents the whole raster data. Let us assume that
the raster data belongs to an area where land is surrounded by
water. Here a particular entity (land) is shown in green color and
the area where land is not present is shown by white.
Contd.,
b) Pixel values – The pixel value for the full
image is shown. Cells having a part of the
land are encoded as 1 and others where land
is not present are encoded as 0.

c) File structure – It demonstrates the method of
coding raster data. The first row of the file
structure data tells that there are 5 rows and 5
columns in the image, and 1 is the maximum
pixel value. The subsequent rows have cells
with value as either 0 or 1 (similar to pixel
values).

Note: Since raster data occupies more space in the database,
encoding has to be done in order to reduce the storage
space
1. Run length encoding
– Reduction of data on a row by row basis
– Stores a single value for a group of cells rather than storing values for
individual cells
– First line represents the dimension of the matrix (5×5) and the number of
entities (1) present. In second and subsequent lines, the first number in the pair
represents absence (0) or presence (1) of the entity and the second number
indicates the number of cells referenced.
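
A minimal sketch of row-by-row run length encoding as described above. The 5×5 raster is a hypothetical example (the original figure is not reproduced here), and the header line with the matrix dimensions is left out to keep the code short.

```python
from itertools import groupby

# Hypothetical 5x5 raster: 1 = entity present, 0 = entity absent.
raster = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]


def rle_encode(grid):
    """Encode each row as a list of (value, run_length) pairs."""
    return [[(value, len(list(run))) for value, run in groupby(row)]
            for row in grid]


def rle_decode(encoded_rows):
    """Expand (value, run_length) pairs back into full rows."""
    return [[value for value, count in row for _ in range(count)]
            for row in encoded_rows]


encoded = rle_encode(raster)
for row in encoded:
    print(row)
```

Rows with long runs of the same value compress well; a row that alternates on every cell would need more pairs than it has cells.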
2. Block encoding
– Data is stored in blocks in the raster matrix
– The entity is subdivided into hierarchical blocks, and blocks are located
using coordinates
– The top-left cell is used as the origin for locating blocks
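
One possible way to derive such blocks is a greedy search for the largest square blocks first. This is an assumption of the sketch, not necessarily the procedure used in the original figure. The raster is hypothetical, and each block is reported as (side length, row, column) counted from the top-left cell at (0, 0).

```python
# Hypothetical 5x5 raster: 1 = entity present, 0 = entity absent.
raster = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]


def block_encode(grid):
    """Greedily cover the cells with value 1 by square blocks, largest first."""
    n_rows, n_cols = len(grid), len(grid[0])
    covered = [[False] * n_cols for _ in range(n_rows)]
    blocks = []
    for size in range(min(n_rows, n_cols), 0, -1):
        for r in range(n_rows - size + 1):
            for c in range(n_cols - size + 1):
                cells = [(r + dr, c + dc)
                         for dr in range(size) for dc in range(size)]
                # Accept the block only if every cell is 1 and still uncovered.
                if all(grid[i][j] == 1 and not covered[i][j]
                       for i, j in cells):
                    for i, j in cells:
                        covered[i][j] = True
                    blocks.append((size, r, c))
    return blocks


blocks = block_encode(raster)
print(blocks)
```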
3. Chain encoding
– Works by defining boundary of the entity i.e. sequence of cells starting from
and returning to the given origin
– Direction of travel is specified using numbers. (0 = North, 1 = East, 2 = South,
3 = West)
– The first line tells that the coding started at cell (4, 2) and there is only one
chain. In the second line the first number in the pair tells the direction and the
second number represents the number of cells lying in this direction.
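
The chain code can be decoded by walking from the origin cell in the stated directions. The sketch below uses the direction numbers from the text and the origin (4, 2). The chain itself is hypothetical, and reading the origin as (row, column) with rows increasing southward is an assumption.

```python
# Direction codes from the text: 0 = North, 1 = East, 2 = South, 3 = West.
# Assumption: cells are (row, col) and rows increase towards the south.
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}

start = (4, 2)
chain = [(0, 2), (1, 3), (2, 2), (3, 3)]   # hypothetical (direction, cells) pairs


def decode_chain(origin, pairs):
    """Return every cell visited while following the chain from the origin."""
    path = [origin]
    r, c = origin
    for direction, count in pairs:
        dr, dc = MOVES[direction]
        for _ in range(count):
            r, c = r + dr, c + dc
            path.append((r, c))
    return path


path = decode_chain(start, chain)
print(path)
print("closed boundary:", path[-1] == start)
```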
4. Quad tree
– A raster is divided into a hierarchy of quadrants that are subdivided
based on similar value pixels
– The division of raster stops when a quadrant is made entirely from cells
of same value
– A quadrant that cannot be subdivided is called a leaf node
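
A minimal recursive sketch of the quad tree subdivision described above. It assumes a square raster whose side is a power of two, and the order of the four quadrants (NW, NE, SW, SE) is a choice made for this example. The sample raster is hypothetical.

```python
# Hypothetical 4x4 raster (side length is a power of two).
raster = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]


def build_quadtree(grid, row=0, col=0, size=None):
    """Return a leaf value for a uniform quadrant, else [NW, NE, SW, SE]."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    if all(grid[r][c] == first
           for r in range(row, row + size)
           for c in range(col, col + size)):
        return first                      # leaf node: all cells share a value
    half = size // 2
    return [build_quadtree(grid, row, col, half),
            build_quadtree(grid, row, col + half, half),
            build_quadtree(grid, row + half, col, half),
            build_quadtree(grid, row + half, col + half, half)]


def count_leaves(tree):
    """Number of leaf nodes in the quad tree."""
    if not isinstance(tree, list):
        return 1
    return sum(count_leaves(child) for child in tree)


tree = build_quadtree(raster)
print(tree, count_leaves(tree))
```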
Advantages and Disadvantages
Data model | Advantages | Disadvantages
RASTER | Simple data structure | Cell size determines the resolution at which the data is represented
RASTER | Compatible with remote sensing or scanned data | Requires a lot of storage space
RASTER | Spatial analysis is easier | Projection transformations are time consuming
RASTER | Simulation is easy because each unit has the same size and shape | Network linkages are difficult to establish
VECTOR | Data is represented at its original resolution and form without generalization | The location of each vertex has to be stored explicitly
VECTOR | Requires less storage space | Overlay based on criteria is difficult
VECTOR | Editing is faster and convenient | Spatial analysis is cumbersome
VECTOR | Network analysis is fast | Simulation is difficult because each unit has a different topological form
VECTOR | Projection transformations are easier |
References
• https://nptel.ac.in/courses/105102015/30
• https://nptel.ac.in/courses/105102015/31
• https://nptel.ac.in/courses/105102015/32
