Principles of GIS Study Guide PDF
I. Introduction
A GIS is a computer-based system with the following data capacities:
1. Capture and preparation
2. Management, including storage and maintenance
3. Manipulation and analysis
4. Presentation
Geo-Information Science: scientific field that attempts to integrate different disciplines studying
the methods and techniques of handling spatial information.
Data: representations that can be operated upon by a computer.
Spatial data: data that contains positional values, such as (x, y) coordinates. Spatial data is stored in
spatial databases or geodatabases.
Information: data that has been interpreted by a human being.
Metadata: data about data (quality of the data, source, etc).
Sources of error: errors in the source data, and processing errors that result from the spatial analysis
and modelling operations carried out by the system on the base data.
Key components of spatial data quality:
• Positional accuracy
• Temporal accuracy (up-to-date data)
• Attribute accuracy
• Lineage (history of data)
• Completeness (dataset contains everything that’s expected)
• Logical consistency: the internal structure of the data conforms to the defined data model
Database: repository for storing large quantities of data. It allows:
• Concurrent use, storage optimization, data integrity, query facility and optimization.
1. Geographic field is a geographic phenomenon for which, for every point in the study area,
a value can be determined.
a. Discrete: divides the study space into mutually exclusive, bounded parts, with all locations
in one part having the same field value.
b. Continuous: the field values along any path through the study area do not change
abruptly, but only gradually.
2. Geographic objects populate the study area, and are usually well distinguished, discrete,
and bounded entities. The space between them is potentially ‘empty’ or undetermined.
Data types and values
1. (Qualitative) Nominal data values: provide a name or identifier
2. Ordinal data values: can be put in some natural sequence but do not allow any other
type of computation.
3. (Quantitative) Interval data values: quantitative, in that they allow simple forms of
computation like addition and subtraction.
4. (Quantitative) Ratio data values: allow most forms of arithmetic computation.
Geographic objects
When a geographic phenomenon is not present everywhere in the study area, but somehow
‘sparsely’ populates it, we look at it as a collection of geographic objects.
Geographic objects are characterized by location, shape (dimension: point, line, polygon), size and orientation.
Boundaries
• Crisp boundary: can be determined with almost arbitrary precision, dependent only on the data
acquisition technique applied.
• Fuzzy boundary: the boundary is not a precise line but is in itself an area of transition.
Line representations
Line data are used to represent one-dimensional objects: the two end nodes and zero or more
internal nodes or vertices define a line.
The straight parts of a line between two consecutive vertices or end nodes are called line segments.
Collections of (connected) lines may represent phenomena that are best viewed as networks.
Area representations
When area objects are stored using a vector approach, the usual technique is to apply a boundary
model. This means that each area feature is represented by some arc/node structure that
determines a polygon as the area’s boundary.
Each boundary is a cyclic sequence of line features; each line—as before—is a sequence of two
end nodes, with in between, zero or more vertices.
Data redundancy: the line that makes up the boundary between two polygons is the same for both, which
means that with the above representation the line is stored twice, once for each polygon.
The boundary model, also called the topological data model, is an improved representation: it stores
parts of a polygon’s boundary as non-looping arcs and indicates which polygon is on the left and which
is on the right of each arc.
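The left/right arc structure can be sketched as a small data structure. This is only an illustration under stated assumptions, not the storage format of any particular GIS: the names Arc, left, right, arcs_of and neighbours are hypothetical, and the example geometry (two unit squares sharing one edge) is made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Arc:
    """A non-looping boundary arc, stored once with its left and right polygon."""
    arc_id: str
    vertices: Tuple[Point, ...]   # start node, internal vertices, end node
    left: Optional[str]           # polygon on the left when walking start -> end
    right: Optional[str]          # polygon on the right (None = outside any polygon)


def arcs_of(polygon: str, arcs: List[Arc]) -> List[str]:
    """Ids of all arcs that form part of the given polygon's boundary."""
    return [a.arc_id for a in arcs if polygon in (a.left, a.right)]


def neighbours(polygon: str, arcs: List[Arc]) -> Set[str]:
    """Polygons that share at least one arc with the given polygon."""
    found = set()
    for a in arcs:
        if a.left == polygon and a.right is not None:
            found.add(a.right)
        elif a.right == polygon and a.left is not None:
            found.add(a.left)
    return found


# Hypothetical example: two unit squares, A (west) and B (east), sharing the edge x = 1.
arcs = [
    Arc("a_outer", ((1, 1), (0, 1), (0, 0), (1, 0)), left="A", right=None),
    Arc("shared", ((1, 0), (1, 1)), left="A", right="B"),
    Arc("b_outer", ((1, 0), (2, 0), (2, 1), (1, 1)), left="B", right=None),
]
```

The shared edge is stored once, and adjacency (which polygons border A) follows directly from the left/right labels without comparing coordinates.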
Temporal dimension
Spatiotemporal data models are ways of organizing representations of space and time in a GIS.
• Discrete and continuous time:
o Discrete time: discrete elements (seconds, minutes, hours, days, months, or years).
o Continuous time: for any two different points in time, there is always another point in
between.
• Transaction time (or database time): time when the event was stored in the database or
GIS.
• Linear, branching and cyclic time
• Time granularity: precision of time value in GIS or database.
• Absolute and relative time:
o Absolute: marks a point on the time line where events happen
o Relative: indicated relative to other points in time
In spatiotemporal analyses we consider changes of spatial and thematic attributes over time. A
recurring problem is object identity.
o Spatial domain fixed: look only at the attribute changes over time for a given location
in space.
o Attribute domain fixed: consider the spatial changes over time for a given thematic
attribute.
o Both variable: consider how objects change over time, e.g. tracking moving objects.
In most of the available systems, spatial data is organized in layers by theme and/or scale.
Raster data storage: a raster is stored in a file as a long list of values, one for each cell, preceded by a
small list of extra data (the ‘file header’) that tells how to interpret the long list. Cells are usually
stored in row order.
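A minimal sketch of how a header plus a row-ordered list of values can be read back as a grid. The header keys nrows and ncols and the function name cell_value are assumptions made for this example; real raster formats carry more header fields.

```python
from typing import Dict, List


def cell_value(header: Dict[str, int], values: List[float], row: int, col: int) -> float:
    """Look up one cell in a raster stored as a flat, row-ordered list."""
    nrows, ncols = header["nrows"], header["ncols"]
    if len(values) != nrows * ncols:
        raise ValueError("value list does not match the header dimensions")
    if not (0 <= row < nrows and 0 <= col < ncols):
        raise IndexError("cell outside the raster")
    # Row ordering: all cells of row 0 first, then row 1, and so on.
    return values[row * ncols + col]


# Hypothetical 2 x 3 raster.
header = {"nrows": 2, "ncols": 3}
values = [1, 2, 3,
          4, 5, 6]
```

Without the header the list is ambiguous: the same six values could equally be a 3 x 2 or a 1 x 6 raster.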
Spatial queries and process models play an important role in spatial analysis. One of the key uses
of GISs has been to support spatial decisions.
Spatial decision support systems (SDSS): a category of information systems composed of a
database, GIS software, models, and a knowledge engine, which allow users to deal specifically with
locational problems.
Analysis of spatial data can be defined as computing new information that provides new insight
from the existing, stored spatial data.
Spatial data analysis presentation
The presentation may either be an end-product, or an intermediate product, as in spatial data made
available through the internet.
Database management systems (DBMSs)
DBMS is a software package that allows the user to set up, use and maintain a database:
o Storing large data sets
o Guard data correctness
o Concurrent use
o Declarative query language
o Supports the use of a data model
o Data backup and recovery
o Control data redundancy
Alternatives: spreadsheet program (only when data is small, numeric and of a single type of use).
Spatial Referencing
The Geoid and vertical datum
The Geoid is used to describe heights: a height determined with respect to a tide-gauge station
is known as an orthometric height (height H above the Geoid).
The Ellipsoid
It provides a relatively simple figure which fits the Geoid to a first order approximation, though for
small scale mapping purposes a sphere may be used.
A local vertical datum is a reference surface used to define the elevation of points within
a specific geographic area. It is based on a fixed point or points within the area, and the elevations
of other points in the area are determined relative to this fixed point.
The most important global (geocentric) spatial reference system for the GIS community is the
International Terrestrial Reference System (ITRS) .
o It is a 3D coordinate system with a well-defined origin (the center of mass of the Earth)
and three orthogonal coordinate axes (X, Y, Z).
Coordinate systems
Planar / 2D
1. Geographic: φ, λ
The most widely used global coordinate system consists of lines of geographic latitude and
longitude
o Lines of equal latitude φ → parallels.
o Lines of equal longitude λ→ meridians
2. Cartesian: x, y
System of intersecting perpendicular lines, which contains X and Y axes
o X → horizontal
o Y → vertical
3. Polar: α, d
The distance d from the origin to the point concerned, and the angle α between a fixed direction
and the direction to the point.
o α is called the bearing or azimuth and is measured clockwise from the fixed direction, in angular
units.
o d is measured in length units.
o Three fixed directions are in use: True North, Grid North and Magnetic North.
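The relation between polar (α, d) and planar Cartesian (x, y) coordinates can be sketched as follows. The sketch assumes that the fixed direction is the positive Y axis (grid north) and that α is given in degrees; the function names are hypothetical.

```python
import math


def polar_to_cartesian(bearing_deg: float, distance: float):
    """Convert a bearing (clockwise from the +Y axis) and a distance to (x, y)."""
    a = math.radians(bearing_deg)
    # Clockwise from north: sine gives the easting, cosine the northing.
    return distance * math.sin(a), distance * math.cos(a)


def cartesian_to_polar(x: float, y: float):
    """Convert (x, y) to (bearing in degrees within [0, 360), distance)."""
    bearing = math.degrees(math.atan2(x, y)) % 360.0
    return bearing, math.hypot(x, y)
```

Note the argument order atan2(x, y): it differs from the usual mathematical convention because bearings run clockwise from north rather than anticlockwise from east.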
Spatial / 3D
1. Geographic: φ, λ, h
Introduces the ellipsoidal height h: vertical distance of the point in question above the
ellipsoid.
2. Geocentric: x, y, z
The system has its origin at the mass-centre of the Earth with the X and Y axes in the plane
of the equator. The X-axis passes through the meridian of Greenwich, and the Z-axis
coincides with the Earth’s axis of rotation.
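The step from 3D geographic coordinates (φ, λ, h) to geocentric coordinates (X, Y, Z) can be sketched with the standard closed formulas for an ellipsoid. The WGS84 semi-major axis and flattening used below are an assumption for illustration (these notes do not prescribe an ellipsoid), and the function name is hypothetical.

```python
import math

# Assumed ellipsoid: WGS84 semi-major axis (m) and flattening.
WGS84_A = 6378137.0
WGS84_F = 1 / 298.257223563


def geographic_to_geocentric(lat_deg, lon_deg, h=0.0, a=WGS84_A, f=WGS84_F):
    """Convert latitude, longitude (degrees) and ellipsoidal height (m) to X, Y, Z (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = f * (2 - f)                                   # first eccentricity squared
    n = a / math.sqrt(1 - e2 * math.sin(lat) ** 2)     # prime vertical radius of curvature
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + h) * math.sin(lat)
    return x, y, z
```

A point on the equator at the Greenwich meridian with h = 0 lands on the X axis at a distance equal to the semi-major axis; a point at the pole lands on the Z axis at a distance equal to the semi-minor axis.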
Map projection
A map projection is a mathematically described technique for representing the Earth’s curved surface on
a flat map.
All maps include distortions, in particular scale distortions, because one cannot flatten an ellipsoidal
or spherical surface without deforming it.
Projecting means transforming each point on the reference surface with geographic coordinates (φ, λ) to
a set of Cartesian coordinates (x, y) representing positions on the map plane.
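As a concrete instance of such a transformation, here is a sketch of one of the simplest projections, the equidistant cylindrical (plate carrée) projection. It assumes a sphere instead of an ellipsoid, a mean Earth radius of 6,371 km and the equator as standard parallel; the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical Earth


def plate_carree(lat_deg: float, lon_deg: float, radius: float = EARTH_RADIUS_M):
    """Map geographic coordinates (degrees) to map-plane coordinates (metres)."""
    x = radius * math.radians(lon_deg)   # easting grows linearly with longitude
    y = radius * math.radians(lat_deg)   # northing grows linearly with latitude
    return x, y
```

Distances along meridians are preserved, but east-west distances are increasingly stretched towards the poles, which illustrates that every projection distorts something.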
Some map projections can be visualized as true geometric projections directly onto the mapping plane;
such a projection is called an azimuthal projection.
Map projection classes:
1. Cylindrical
2. Conical
3. Azimuthal
Secant projection classes:
1. Cylindrical
2. Conical
3. Azimuthal
The Universal Transverse Mercator (UTM) uses a transverse cylinder, secant to the horizontal
reference surface. UTM is an important projection used worldwide.
o Derived from the Transverse Mercator projection.
o Divides the world into 60 narrow longitudinal zones of 6 degrees.
o The narrow zones of 6 degrees (and the secant map surface) make the distortions small
enough for large scale topographic mapping.
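The 60 zones of 6 degrees lead to a simple rule for finding the zone of a given longitude. The sketch below assumes the usual numbering, with zone 1 starting at 180° W and zones increasing eastwards, and it ignores the special zone boundaries around Norway and Svalbard; the function names are hypothetical.

```python
import math


def utm_zone(lon_deg: float) -> int:
    """Return the UTM zone number (1..60) for a longitude in degrees."""
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude must lie between -180 and 180 degrees")
    # The modulo maps longitude +180 back onto zone 1.
    return int(math.floor((lon_deg + 180.0) / 6.0)) % 60 + 1


def central_meridian(zone: int) -> int:
    """Return the central meridian (degrees) of a UTM zone."""
    if not 1 <= zone <= 60:
        raise ValueError("zone must lie between 1 and 60")
    return zone * 6 - 183
```

For example, a longitude of 5° E falls in zone 31, whose central meridian is 3° E.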
The distortion properties of a map are typically classified according to what is not distorted on the
map:
o Equidistant: the length of particular lines in the map equals the length of the original lines on the
curved reference surface.
o Conformal: the angles between lines in the map equal the angles between the original lines on
the curved reference surface.
o Equal-area: areas in the map equal the areas on the curved reference surface.
Satellite-based positioning
Absolute positioning
Absolute positioning: process of determining the exact location of a point on the Earth's surface
using satellite-based navigation systems, such as GPS.
It involves measuring the time delay between the transmission of a signal from a satellite and its
reception by a receiver on the ground, and using this information to calculate the distance
between the satellite and the receiver.
The location is determined in terms of latitude, longitude, and altitude.
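The distance calculation described above comes down to multiplying the signal travel time by the speed of light. The sketch below is a simplification: it assumes both clocks are on the same time scale and ignores atmospheric delay. The function names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def signal_range(t_transmit: float, t_receive: float) -> float:
    """Distance (m) implied by the delay between transmission and reception (s)."""
    delay = t_receive - t_transmit
    if delay < 0:
        raise ValueError("reception cannot precede transmission")
    return SPEED_OF_LIGHT * delay


def range_error(clock_error_s: float) -> float:
    """Distance error (m) caused by a given clock error (s)."""
    return SPEED_OF_LIGHT * clock_error_s
```

A delay of 0.07 s corresponds to roughly 21,000 km, and a clock error of only one microsecond already shifts the computed distance by about 300 m, which is why clock errors matter so much for positioning.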
Errors in Absolute Positioning
o Space segment: incorrect clock reading, incorrect orbit position.
o Medium: signal delays in the troposphere and the ionosphere.
o Receiver’s environment: multi-path, in which multiple receptions of the same signal interfere with
each other.
o Satellite geometry: errors caused by the positions of the satellites in the sky; when the
satellites are at a low elevation, the calculated distances to the satellites are less reliable.
Relative positioning
Relative positioning is a method of determining the position of a point relative to a known reference
point, as opposed to determining an absolute position on the Earth's surface.
Network positioning
One or more control centres receive the reference station data, verify this for correctness, and
relay (uplink) this information to a geostationary satellite. The satellite will retransmit the
correctional data to the area that it covers, so that target receivers, using their own approximate
position, can determine how to correct for satellite signal error, and consequently obtain much more
accurate position fixes.
Code versus phase measurements
Phase measurements are possible only with relative positioning: they aim to determine the number of
cycles of the (sine-shaped) radio signal between sender and receiver.
Positioning technology
                         GPS                       GLONASS     Galileo
Satellites               24                        24          27
Altitude                 20,200 km                 19,130 km   23,222 km
Orbital planes           6                         3           3
Satellites per receiver  5-7                       -           -
Reference system         WGS84 (aligned to ITRF)   PZ-90       WGS84 and GTRF
Time system              UTC                       UTC         International Atomic Time (TAI)
Satellite-based augmentation systems (SBAS): systems that enhance the accuracy and integrity
of satellite-based navigation signals by using a network of ground-based reference stations that
monitor the navigation signals and then broadcast corrections and integrity information to users via
geostationary satellites.
o WAAS (Wide-Area Augmentation System) for North America,
o EGNOS (European Geostationary Navigation Overlay Service) for Europe,
o MSAS (Multi-functional Satellite Augmentation System) for eastern Asia.
V. Data Entry
Process of acquiring, digitizing, and importing geographic data into the GIS software.
Data quality
An accurate measurement has a mean close to the true value;
a precise measurement has a sufficiently small variance.
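The two notions can be told apart numerically: accuracy is about the offset of the mean from the true value, precision about the spread of repeated measurements. A small sketch, with made-up measurement values and a hypothetical function name:

```python
from statistics import mean, variance


def accuracy_and_precision(measurements, true_value):
    """Return (bias, sample variance) of repeated measurements of one quantity."""
    bias = mean(measurements) - true_value   # small bias   -> accurate
    spread = variance(measurements)          # small spread -> precise
    return bias, spread


# Hypothetical repeated measurements of a distance whose true value is 10.0 m.
accurate_but_imprecise = [9.0, 11.0, 10.5, 9.5]
precise_but_inaccurate = [12.0, 12.01, 11.99, 12.0]
```

The first series averages out to the true value but scatters widely; the second is tightly clustered around the wrong value.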
Positional accuracy
1. Human errors in measurement: large errors resulting from carelessness; they can never be ruled out
with absolute certainty.
2. Instrumental or systematic errors: vary systematically in sign and/or magnitude, and can
go undetected when the measurement is repeated with the same instruments. Systematic errors accumulate.
3. Random errors: caused by natural variations in the quantity being measured; they are dealt with in
least-squares adjustment.
Root Mean Square Error (RMSE): the square root of the mean squared difference between the predicted
(measured) values and the actual values of a dataset.
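A minimal sketch of the RMSE computation, taking the square root of the mean squared difference between paired values. The function name and the example values in the note below are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equally long sequences of numbers."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

For instance, measured values of 3.0 and 4.0 against true values of 0.0 and 0.0 give an RMSE of about 3.54; because the differences are squared, large errors weigh more heavily than small ones.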
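Epsilon band: a zone of fixed width on either side of a line feature within which the true position is
expected to lie; a measure of the positional uncertainty of geographic positions.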
Accuracy Tolerances: acceptable levels of error for a given dataset or application
Natural uncertainty in spatial data: factors that can affect the positional accuracy of geographic
features → e.g. technology used to collect the data, the environment in which the data was collected,
and the inherent uncertainty of the feature being measured.
Attribute accuracy
Depends on the type of data:
o Nominal or categorical data: accuracy of labeling → e.g. type of land cover.
o Numerical data: numerical accuracy → e.g. concentration of pollutant, height of trees.
Temporal accuracy
The accuracy of the time or date associated with a feature or dataset.
Lineage
The history of a dataset, including how it was created, modified, and used over time.
Completeness
Refers to whether a dataset contains all of the features and attributes that it is supposed to have.
Logical consistency
Refers to the internal consistency of a dataset.
Data preparation
Data checks and repairs
Rasterization
Process of converting vector data, which is composed of points, lines, and polygons, into a raster
format, which is composed of pixels or cells
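For point features, rasterization amounts to finding the cell that each coordinate pair falls in. The sketch below assumes a north-up grid with square cells whose origin is the upper-left corner (x_min, y_max); rasterizing lines and polygons takes more work and is not shown. All names are hypothetical.

```python
import math


def rasterize_points(points, x_min, y_max, cell_size, nrows, ncols):
    """Return an nrows x ncols grid with 1 in every cell that contains a point."""
    grid = [[0] * ncols for _ in range(nrows)]
    for x, y in points:
        col = math.floor((x - x_min) / cell_size)
        row = math.floor((y_max - y) / cell_size)   # rows count downwards from the top
        if 0 <= row < nrows and 0 <= col < ncols:   # points outside the grid are skipped
            grid[row][col] = 1
    return grid
```

With 10 m cells and the origin at (0, 30), the point (5, 25) falls in the top-left cell and the point (25, 5) in the bottom-right cell of a 3 x 3 grid.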
Vectorization
Converts raster data into vector data.
Associating attributes
Process of linking data in one table to data in another table.
Topology checks
Process of checking the spatial relationships between features in a dataset, such as the adjacency,
connectivity and containment.
Combining data from multiple sources
1. Same area, but differ in accuracy
2. Same area, but differ in choice of representation
3. Adjacent areas, and have to be merged into a single data set.
4. Same or adjacent areas, but referenced in different coordinate systems.
The map scale is the ratio between a distance on the map and the corresponding distance in
reality.
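Because the scale is a ratio, converting between map and ground distances is a single multiplication or division by the scale denominator (50,000 for a 1:50,000 map). A small sketch with hypothetical function names; both distances are in the same unit:

```python
def ground_from_map(map_distance: float, scale_denominator: float) -> float:
    """Ground distance for a distance measured on the map (same unit)."""
    return map_distance * scale_denominator


def map_from_ground(ground_distance: float, scale_denominator: float) -> float:
    """Map distance for a distance measured on the ground (same unit)."""
    return ground_distance / scale_denominator
```

On a 1:50,000 map, 0.04 m (4 cm) on the map corresponds to 2,000 m on the ground.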
A topographic map visualizes, limited by its scale, the Earth’s surface as accurately as possible.
Thematic maps represent the distribution of particular themes. One can distinguish between socio-
economic themes and physical themes.
Visual variables
o Size
o Value (lightness)
o Texture
o Colour
o Orientation
o Shape
How to Map
Key considerations:
o Map purpose: The map should be designed with a specific purpose in mind, such as
informing, educating, or persuading.
o Map scale: The map should be designed at an appropriate scale, taking into account the
size of the area being mapped and the level of detail required.
o Map projections: The map should be designed using an appropriate projection, taking into
account the shape of the earth and the area being mapped.
o Map symbols and colors: The map should use appropriate symbols and colors to represent
the data, taking into account the meaning of the data and the audience for the map.
o Map layout and design: The map should be designed with a clear and logical layout, taking
into account the importance of the data and the audience for the map.
o Map accuracy and precision: The map should be designed with the appropriate level of
accuracy and precision, taking into account the source data and the intended use of the map.
o Map labeling and annotation: The map should be labeled and annotated in a clear and
consistent way, taking into account the importance of the data and the audience for the map.
o Map evaluation: The map should be evaluated for its effectiveness in communicating the
data it represents.