
Principles of GIS Study Guide PDF

A GIS allows for the capture, storage, manipulation, analysis and presentation of geospatial data. It operates under the assumption that geographic phenomena can be represented in a two or three dimensional space. Key components of spatial data quality include positional accuracy, temporal accuracy, attribute accuracy, logical consistency and completeness. Spatial data in a GIS can take the form of raster data represented as a grid of cells or vector data representing points, lines and polygons. Proper data management is important for tasks like overlaying spatial data layers and analyzing changes over different points in time.

Principles of GIS

I. Introduction
A GIS is a computer-based system with the following data capacities:
1. Capture and preparation
2. Management, including storage and maintenance
3. Manipulation and analysis
4. Presentation
Geo-Information Science: scientific field that attempts to integrate different disciplines studying
the methods and techniques of handling spatial information.
Data: representations that can be operated upon by a computer.
Spatial data: data that contains positional values, such as (x, y) co-ordinates→ Stored in Spatial
Databases or Geodatabases.
Information: data that has been interpreted by a human being.
Metadata: data about data (quality of the data, source, etc).
Data quality is affected by errors in the source data and by processing errors that result from the
spatial analysis and modelling operations the system carries out on the base data.
Key components of spatial data quality:
• Positional accuracy
• Temporal accuracy (up-to-date data)
• Attribute accuracy
• Lineage (history of data)
• Completeness (dataset contains everything that’s expected).
• Logical consistency: the internal structure of the data conforms to the defined data model.
Database: repository for storing large quantities of data. It allows:
• Concurrent use, storage optimization, data integrity, query facility and optimization.

II. Geographic information and Spatial data types


Modelling: process of producing an abstraction of the ‘real world’ so that some part of it can be
more easily handled.
We use the GIS to create visualizations from the computer representation.
A GIS operates under the assumption that the relevant spatial phenomena occur in a two- or
three-dimensional Euclidean space.
A geographic phenomenon is the manifestation of an entity or process of interest that:
• Can be named or described,
• Can be georeferenced, and
• Can be assigned a time (interval) at which it is/was present.

1. Geographic field is a geographic phenomenon for which, for every point in the study area,
a value can be determined.
a. Discrete: divide the study space in mutually exclusive, bounded parts, with all locations
in one part having the same field value
b. Continuous: the field values along any path through the study area do not change
abruptly, but only gradually.

2. Geographic objects populate the study area, and are usually well distinguished, discrete,
and bounded entities. The space between them is potentially ‘empty’ or undetermined.
Data types and values
1. (Qualitative) Nominal data values: provide a name or identifier
2. Ordinal data values: can be put in some natural sequence but that do not allow any other
type of computation.
3. (Quantitative) Interval data values: quantitative, in that they allow simple forms of
computation like addition and subtraction.
4. (Quantitative) Ratio data values: allow most forms of arithmetic computation.

Geographic objects
When a geographic phenomenon is not present everywhere in the study area, but somehow
‘sparsely’ populates it, we look at it as a collection of geographic objects.
Location, shape (dimension: point, linear, polygon), size and orientation.
Boundaries
• Crisp boundary: can be determined with almost arbitrary precision, dependent only on the data
acquisition technique applied.
• Fuzzy boundary: not a precise line, but itself an area of transition.

Computer representations of geographic information


We can use an interpolation function that allows us to infer a reasonable elevation value for
locations that are not stored.
A simple and commonly used interpolation function takes the elevation value of the nearest location
that is stored.
Interpolation is made possible by a principle called spatial autocorrelation: the fact that locations
that are closer together are more likely to have similar values than locations that are far apart
Tessellation
A tessellation (or tiling) is a partitioning of space into mutually exclusive cells that together make up the
complete study space. With each cell, some (thematic) value is associated to characterize that part
of space.
In GIS, regular tessellations are known as rasters.
Irregular tessellations are more complex than the regular ones, but they are also more adaptive
→ less data to store.
If we restrict the use of a plane to the area between its three anchor points, we obtain a
triangular tessellation of the complete study space
Raster: set of regularly spaced (and contiguous) cells with associated (field) values. The associated
values represent cell values, not point values. This means that the value for a cell is assumed to be
valid for all locations within the cell.
Size of the raster cell = raster resolution.
Grid: is a collection of regularly spaced (field) values.
Some convention is needed to state which value prevails on cell boundaries; with square cells, this
convention often says that lower and left boundaries belong to the cell. To improve on this continuity
issue, we can do two things: a) make the cell smaller or b) assume cell value only represents
elevation for one specific location in the cell.
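The boundary convention above can be sketched in a few lines. This is a minimal illustration, assuming square cells and a raster whose origin is its lower-left corner; the function name `cell_index` and its parameters are hypothetical and not taken from any GIS package.

```python
import math

def cell_index(x, y, x_min, y_min, cell_size, n_rows, n_cols):
    """Return (row, col) of the cell containing (x, y), or None if outside.

    Assumed convention: the lower and left boundaries belong to the cell,
    so each cell covers a half-open interval [lower, upper) on both axes.
    Rows are counted upward from the lower edge of the raster.
    """
    col = math.floor((x - x_min) / cell_size)
    row = math.floor((y - y_min) / cell_size)
    if 0 <= row < n_rows and 0 <= col < n_cols:
        return row, col
    return None
```

A point exactly on a shared boundary therefore falls in the cell to its right (or above), and the upper and right edges of the whole raster fall outside it.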
Vector representations
Vector representations: explicitly associate georeferences with the geographic phenomena.
A commonly used data structure in GIS software is the triangulated irregular network, or TIN: it
represents a continuous field and is built from a set of locations at which a measurement is available.
Point representations
Points: single coordinate pairs (x, y) when we work in 2D, or coordinate triplets (x, y, z) when we work in
3D. They are shape- and sizeless, zero-dimensional features.

Line representations
Line data are used to represent one-dimensional objects: the two end nodes and zero or more
internal nodes or vertices define a line.
The straight parts of a line between two consecutive vertices or end nodes are called line segments.
Collections of (connected) lines may represent phenomena that are best viewed as networks.
Area representations
When area objects are stored using a vector approach, the usual technique is to apply a boundary
model. This means that each area feature is represented by some arc/node structure that
determines a polygon as the area’s boundary.
Each boundary is a cyclic sequence of line features; each line—as before—is a sequence of two
end nodes, with in between, zero or more vertices.
Data redundancy: the line that makes up the boundary between two polygons is the same for both, which means
that using the above representation the line would be stored twice, namely once for each polygon.
The topological data model is an improved boundary model: it stores parts of a
polygon’s boundary as non-looping arcs and indicates which polygon is on the left and which is on
the right of each arc.
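A small sketch of the arc structure may help. The class and field names (`Arc`, `left_polygon`, `right_polygon`) are hypothetical, and node coordinates are left out for brevity; the point is only that a shared boundary is stored once and that adjacency can be read directly from the left/right labels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arc:
    arc_id: str
    start_node: str
    end_node: str
    vertices: tuple       # internal (x, y) vertices between the end nodes
    left_polygon: str     # polygon on the left when travelling start -> end
    right_polygon: str    # polygon on the right

def boundary_arcs(arcs, polygon):
    """Ids of all arcs that form part of the polygon's boundary."""
    return [a.arc_id for a in arcs
            if polygon in (a.left_polygon, a.right_polygon)]

def neighbours(arcs, polygon):
    """Polygons that share at least one arc with the given polygon."""
    result = set()
    for a in arcs:
        if a.left_polygon == polygon:
            result.add(a.right_polygon)
        elif a.right_polygon == polygon:
            result.add(a.left_polygon)
    return result

# Two polygons A (west) and B (east) sharing arc a1, which is stored once.
arcs = [
    Arc("a1", "n1", "n2", (), "A", "B"),
    Arc("a2", "n2", "n1", ((0.0, 5.0),), "A", "outside"),
    Arc("a3", "n2", "n1", ((10.0, 5.0),), "outside", "B"),
]
```

With this data, `neighbours(arcs, "A")` yields B and the outside area without any geometric computation.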

General spatial topology


Topology: the spatial relationships between geographical elements in a data set that do not change
under a continuous transformation.
• Space is Euclidean
• Space is metric
• Space is topological
• Interior and boundary are properties of spatial features invariant under topological mappings.
We can define within the topological space, features that are easy to handle, simplices: the simplest
geometric shapes of some dimension:
• point (0-simplex),
• line segment (1-simplex),
• triangle (2-simplex), and
• tetrahedron (3-simplex).
Spatial relationships:
Topological rules:

Scale and resolution


Map scale: the ratio between the distance on a paper map and the distance of the same stretch in
the terrain. When applied to spatial data, the term resolution is commonly associated with the cell
width of the tessellation applied.
Raster representation
Long list of field values: list is preceded with some extra information.

Vector representation of a field


This technique uses isolines of the field: a linear feature that connects the points with equal
field value. When the field is elevation: contour lines.
Representation of geographic objects
Various techniques exist to process digital images into classified images that can be stored in a GIS
as a raster. Image classification attempts to characterize each pixel into one of a finite list of
classes, thereby obtaining an interpretation of the contents of the image.
Raster data formats → good for area objects, possible for line objects.
Vector representation → more natural

Organizing and managing spatial data


A spatial data layer is either a representation of a continuous or discrete field, or a collection
of objects of the same kind.
Data layers can be overlaid with each other, inside the GIS package, so as to study combinations
of geographic phenomena.

Temporal dimension
Spatiotemporal data models are ways of organizing representations of space and time in a GIS.
• Discrete and continuous time:
o Discrete time: discrete elements (seconds, minutes, hours, days, months, or years).
o Continuous time: for any two different points in time, there is always another point in
between.
• Transaction time (or database time): time when the event was stored in the database or
GIS.
• Linear, branching and cyclic time
• Time granularity: precision of time value in GIS or database.
• Absolute and relative time:
o Absolute: marks a point on the time line where events happen
o Relative: indicated relative to other points in time
In spatiotemporal analyses we consider changes of spatial and thematic attributes over time;
a recurring problem → object identity.
o Spatial domain fixed: look only at the attribute changes over time for a given location
in space.
o Attribute domain fixed: consider the spatial changes over time for a given thematic
attribute.
o Both variable: consider how fields or objects change over time, e.g. tracking moving objects.

III. Data management and processing systems


Functional components of a GIS:

Spatial Data Infrastructure


SDI: relevant base collection of technologies, policies and institutional arrangements that facilitate
the availability of and access to spatial data
Traditional techniques for obtaining spatial data, typically from paper sources, included manual
digitizing and scanning.

In most of the available systems, spatial data is organized in layers by theme and/or scale.
Raster data storage: stored in a file as a long list of values, one for each cell, preceded by a small
list of extra data (the ‘file header’) that tells how to interpret the long list → row ordering.
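Row ordering can be illustrated as follows. This is a sketch with a made-up header layout (the keys `n_rows`, `n_cols` and `cell_size` are assumptions; real raster formats define their own header fields).

```python
def read_cell(header, values, row, col):
    """Look up one cell in a raster stored as a flat, row-ordered list."""
    if not (0 <= row < header["n_rows"] and 0 <= col < header["n_cols"]):
        raise IndexError("cell outside raster")
    # All cells of row 0 come first, then all cells of row 1, and so on.
    return values[row * header["n_cols"] + col]

header = {"n_rows": 2, "n_cols": 3, "cell_size": 30.0}
values = [1, 2, 3,    # row 0
          4, 5, 6]    # row 1
```

Without the header the list of six values could equally be read as 3 rows of 2 cells, which is why the header is needed to interpret it.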

Spatial queries and process models play an important role in spatial analysis. One of the key uses
of GISs has been to support spatial decisions.
Spatial decision support systems (SDSS): are a category of information systems composed of a
database, GIS software, models, and a knowledge engine which allow users to deal specifically with
locational problems.
Analysis of spatial data can be defined as computing new information that provides new insight
from the existing, stored spatial data:
Spatial data analysis presentation
The presentation may either be an end-product, or an intermediate product, as in spatial data made
available through the internet.
Database management systems (DBMSs)
DBMS is a software package that allows the user to set up, use and maintain a database:
o Storing large data sets
o Guard data correctness
o Concurrent use
o Declarative query language
o Supports the use of a data model
o Data backup and recovery
o Control data redundancy
Alternatives: spreadsheet program (only when data is small, numeric and of a single type of use).

Relational Data Model


The relational data model organizes data into one or more tables (or "relations") of rows and
columns, with each table consisting of a set of designated columns and a unique key.
The relationships between tables are established using the values of the unique keys, which act as
foreign keys in related tables.
A "relation" is a table of data that is organized into rows and columns. Each row in the table is called
a "tuple," and each column is called an "attribute."
A relation is defined by a set of attributes that describe the properties of the data it contains.
Each attribute has a name and a data type, such as "integer" or "string."
A tuple is a single row of data in a relation, and it contains a value for each attribute of the relation.
The unique key attribute(s) of a relation is used to identify a tuple uniquely in a relation. This
can be used to make sure that each tuple in a relation is unique, and it is also used to establish
relationships between different relations.
Structures used to define the database are attributes, tuples and relations.
Database → tables (relations) → collection of tuples (records) → each tuple has a value for each attribute.
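The ideas of relation, tuple, attribute, unique key and foreign key can be tried out with Python's built-in `sqlite3` module. The tables `owner` and `parcel` and their contents are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# Each relation has named, typed attributes and a unique (primary) key.
conn.execute("CREATE TABLE owner (owner_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,
        area_m2   REAL NOT NULL,
        owner_id  INTEGER NOT NULL REFERENCES owner(owner_id)  -- foreign key
    )""")

# Each inserted row is one tuple with a value for every attribute.
conn.executemany("INSERT INTO owner VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO parcel VALUES (?, ?, ?)",
                 [(10, 500.0, 1), (11, 750.0, 1), (12, 300.0, 2)])

# Declarative query: the key values link the two relations.
rows = conn.execute("""
    SELECT owner.name, SUM(parcel.area_m2)
    FROM parcel JOIN owner ON parcel.owner_id = owner.owner_id
    GROUP BY owner.name
    ORDER BY owner.name""").fetchall()
```

Here `rows` holds the total parcel area per owner, obtained by matching `parcel.owner_id` against the unique key of `owner`.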

GIS and spatial databases


GIS software provides support for spatial data and thematic or attribute data.

IV. Spatial referencing and positioning

Spatial Referencing
The Geoid and vertical datum
The Geoid is used to describe heights: a height determined with respect to a tide-gauge station
is known as the orthometric height (height H above the Geoid).

The Ellipsoid
It provides a relatively simple figure which fits the Geoid to a first order approximation, though for
small scale mapping purposes a sphere may be used.
A local horizontal datum is a reference ellipsoid positioned and oriented so that it fits the Geoid well within
a specific geographic area. It is anchored at a fundamental point in that area, and the horizontal positions (φ, λ)
of other points in the area are determined relative to it. Elevations, by contrast, are referenced to a vertical datum.
The most important global (geocentric) spatial reference system for the GIS community is the
International Terrestrial Reference System (ITRS) .
o It is a 3D coordinate system with a well-defined origin (the centre of mass of the Earth)
and three orthogonal coordinate axes (X, Y, Z).
Coordinate systems
Planar / 2D
1. Geographic: φ, λ
The most widely used global coordinate system consists of lines of geographic latitude and
longitude
o Lines of equal latitude φ → parallels.
o Lines of equal longitude λ→ meridians

2. Cartesian: x, y
System of intersecting perpendicular lines, which contains X and Y axes
o X → horizontal
o Y → vertical

3. Polar: α, d
Distance d: from the origin to the point concerned and the angle α between a fixed direction
and the direction to the point.
o α is called Bearing or azimuth and is measured in a clockwise direction in angular
units.
o d is measured in length units.
o Three directions are used: True North, Grid North and Magnetic North.
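The relation between polar and Cartesian coordinates can be written out as a short sketch. It assumes the fixed direction is the positive Y axis (e.g. Grid North) and that the bearing is given in degrees; the function names are illustrative.

```python
import math

def polar_to_cartesian(bearing_deg, distance):
    """Bearing measured clockwise from the +Y axis (north), distance in length units."""
    a = math.radians(bearing_deg)
    return distance * math.sin(a), distance * math.cos(a)

def cartesian_to_polar(x, y):
    """Inverse conversion; the bearing is returned in the range [0, 360)."""
    bearing = math.degrees(math.atan2(x, y)) % 360.0
    return bearing, math.hypot(x, y)
```

Note the swapped roles of sine and cosine compared with the mathematical convention, where angles are counted anticlockwise from the X axis.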
Spatial / 3D
1. Geographic: φ, λ, h
Introduces the ellipsoidal height h: vertical distance of the point in question above the
ellipsoid.

2. Geocentric: x, y, z
The system has its origin at the mass-centre of the Earth with the X and Y axes in the plane
of the equator. The X-axis passes through the meridian of Greenwich, and the Z-axis
coincides with the Earth’s axis of rotation.
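The link between 3D geographic coordinates (φ, λ, h) and geocentric coordinates (X, Y, Z) can be sketched with the standard ellipsoid formulas. The WGS84 semi-major axis and flattening are used as defaults here as an assumption; any other ellipsoid can be passed in.

```python
import math

WGS84_A = 6378137.0            # semi-major axis in metres
WGS84_F = 1 / 298.257223563    # flattening

def geographic_to_geocentric(lat_deg, lon_deg, h=0.0, a=WGS84_A, f=WGS84_F):
    """Convert latitude, longitude (degrees) and ellipsoidal height (m) to X, Y, Z (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = f * (2.0 - f)                                   # first eccentricity squared
    n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)     # prime vertical radius of curvature
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h) * math.sin(lat)
    return x, y, z
```

A point on the equator at the Greenwich meridian lands on the X axis at one semi-major axis from the origin, and the pole lands on the Z axis at one semi-minor axis.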
Map projection
A map projection is a mathematically described technique for representing the Earth’s curved surface on a flat map.
All maps include distortions → scale distortions (one cannot flatten an ellipsoidal or spherical surface without distortion).
Means: transforming each point on the reference surface with geographic coordinates (φ, λ) to
a set of Cartesian coordinates (x, y) representing positions on the map plane.
Some map projections → visualized as true geometric projections directly onto the mapping plane
→ an azimuthal projection
Map projection classes:
1. Cylindrical
2. Conical
3. Azimuthal
Secant projection classes:
1. Cylindrical
2. Conical
3. Azimuthal
The Universal Transverse Mercator (UTM) uses a transverse cylinder, secant to the horizontal
reference surface. UTM is an important projection used worldwide.
o Derived from the Transverse Mercator projection
o Divides the world into 60 narrow longitudinal zones of 6 degrees.
o The narrow zones of 6 degrees (and the secant map surface) make the distortions small
enough for large scale topographic mapping.
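The zone layout follows directly from the 6-degree width. The sketch below assumes the usual numbering (zone 1 starts at 180° W and numbers increase eastward) and ignores the special zone exceptions used around Norway and Svalbard; the function names are illustrative.

```python
import math

def utm_zone(lon_deg):
    """Zone number (1-60) for a longitude in degrees, east positive."""
    lon = (lon_deg + 180.0) % 360.0 - 180.0      # normalise to [-180, 180)
    return int(math.floor((lon + 180.0) / 6.0)) + 1

def central_meridian(zone):
    """Longitude in degrees of the central meridian of a zone."""
    return zone * 6 - 183
```

For example, a longitude of 6.9° E falls in zone 32, whose central meridian is 9° E.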
The distortion properties of a map are typically classified according to what is not distorted on the
map:
o Equidistant: length of particular lines in map = length of the original lines on the curved
reference surface.
o Conformal: angles between lines in the map = to the angles between the original lines on
the curved reference surface.
o Equal-area: areas in the map are = to the areas on the curved reference surface.

Satellite-based positioning
Absolute positioning
Absolute positioning: process of determining the exact location of a point on the Earth's surface
using satellite-based navigation systems, such as GPS.
It involves measuring the time delay between the transmission of a signal from a satellite and its
reception by a receiver on the ground, and using this information to calculate the distance
between the satellite and the receiver.
The location is determined in terms of latitude, longitude, and altitude.
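The distance computation described above is simply travel time multiplied by the speed of light. The sketch below also shows why clock errors matter; the function name and the clock-error parameter are assumptions made for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0   # metres per second

def pseudorange(travel_time_s, receiver_clock_error_s=0.0):
    """Measured satellite-receiver distance in metres.

    The receiver only observes the travel time as seen by its own clock,
    so any clock error is multiplied by the speed of light as well.
    """
    return SPEED_OF_LIGHT * (travel_time_s + receiver_clock_error_s)
```

A clock error of one microsecond already shifts the computed distance by about 300 m, which is why clock readings appear among the error sources for absolute positioning.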
Errors in Absolute Positioning
o Space segment
▪ Incorrect clock reading
▪ Incorrect orbit position
o Medium
▪ Troposphere
▪ Ionosphere
o Receiver’s environment
▪ Multi-path: multiple receptions of the same signal may interfere with each other
o Satellite geometry errors: caused by the position of the satellites in the sky; when the
satellites are at a low elevation, this can cause errors in the calculated distances to the
satellites.

Relative positioning
Relative positioning is a method of determining the position of a point relative to a known reference
point, as opposed to determining an absolute position on the Earth's surface.
Network positioning
One or more control centres receive the reference station data, verify this for correctness, and
relay (uplink) this information to a geostationary satellite. The satellite will retransmit the
correctional data to the area that it covers, so that target receivers, using their own approximate
position, can determine how to correct for satellite signal error, and consequently obtain much more
accurate position fixes.
Code versus phase measurements
Phase measurements, used only with relative positioning, aim to determine the number of cycles of the
(sine-shaped) radio signal between sender and receiver.

Positioning technology
                         GPS                       GLONASS     Galileo
Satellites               24                        24          27
Altitude                 20,200 km                 19,130 km   23,222 km
Orbital planes           6                         3           3
Satellites per receiver  5-7                       -           -
Reference system         WGS84 (aligned to ITRF)   PZ-90       WGS84 and GTRF
Time system              UTC                       UTC         International Atomic Time (TAI)

Satellite-based augmentation systems (SBAS): systems that enhance the accuracy and integrity
of satellite-based navigation signals by using a network of ground-based reference stations that
monitor the navigation signals and then broadcast corrections and integrity information to users via
geostationary satellites.
o WAAS (Wide-Area Augmentation System) for North America,
o EGNOS (European Geostationary Navigation Overlay Service) for Europe,
o MSAS (Multi-functional Satellite Augmentation System) for eastern Asia.

V. Data Entry
Process of acquiring, digitizing, and importing geographic data into the GIS software.

Spatial data input


Direct spatial data capture
This data can come from a variety of sources, including satellite imagery, aerial photography, field
surveys, and existing maps and databases.
Data which is captured directly from the environment = primary data.
Indirect spatial data capture
o Digitizing: manually tracing features from a hard-copy map or aerial photograph and
converting them into digital form by using a digitizing tablet or similar device.
o Scanning: This involves converting hard-copy maps and aerial photographs into digital form
by using a scanner.
o Vectorization: distilling points, lines and polygons from a scanned image.

Other sources of spatial data


o Importing from spatial data clearing house / infrastructure: This involves importing data
from existing GIS databases or other sources in a variety of formats, such as shapefiles or
geodatabases.
o Metadata: background information that describes all necessary information about the data
itself.
o Identification
o Data quality
o Entity and attribute
o Metadata standards for digital spatial data
▪ International Organization for Standardization (ISO) and the Open Geospatial
Consortium (OGC) standards.

Data quality
An accurate measurement: a mean close to the true value;
A precise measurement : a sufficiently small variance.
Positional accuracy
1. Human errors in measurement: large errors resulting from carelessness (never absolute
certainty).
2. Instrumental or systematic errors: vary systematically in sign and/or magnitude, and can
go undetected when the measurement is repeated with the same instruments. These errors accumulate.
3. Random errors: caused by natural variations in the quantity being measured→ dealt with in
least–squares adjustment.
Root Mean Square Error (RMSE): measure of the difference between the predicted values and the
actual values of a dataset
Epsilon band: measure of the precision of geographic positions.
Accuracy Tolerances: acceptable levels of error for a given dataset or application
Natural uncertainty in spatial data: factors that can affect the positional accuracy of geographic
features → e.g. technology used to collect the data, the environment in which the data was collected,
and the inherent uncertainty of the feature being measured.
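For positional accuracy, the RMSE mentioned above is commonly computed from the coordinate differences between measured points and reference ("true") points. The sketch below assumes 2D points given as (x, y) pairs in the same coordinate system; the function name is illustrative.

```python
import math

def positional_rmse(measured, reference):
    """Root mean square of the 2D distances between measured and reference points."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty point lists of equal length")
    total = sum((mx - rx) ** 2 + (my - ry) ** 2
                for (mx, my), (rx, ry) in zip(measured, reference))
    return math.sqrt(total / len(measured))
```

Because the differences are squared before averaging, a few large errors raise the RMSE much more than many small ones.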

Attribute accuracy
Depends on the type of data:
o Nominal or categorical data: accuracy of labeling → e.g. type of land cover.
o Numerical data: numerical accuracy → e.g. concentration of pollutant, height of trees.

Temporal accuracy
The accuracy of the time or date associated with a feature or dataset.
Lineage
The history of a dataset, including how it was created, modified, and used over time.
Completeness
Refers to whether a dataset contains all of the features and attributes that it is supposed to have.
Logical consistency
Refers to the internal consistency of a dataset.

Data preparation
Data checks and repairs
Rasterization
Process of converting vector data, which is composed of points, lines, and polygons, into a raster
format, which is composed of pixels or cells
Vectorization
Converts raster data into vector data.
Associating attributes
Process of linking data in one table to data in another table.
Topology checks
Process of checking the spatial relationships between features in a dataset, such as the adjacency,
connectivity and containment.
Combining data from multiple sources
1. Same area, but differ in accuracy
2. Same area, but differ in choice of representation
3. Adjacent areas, and have to be merged into a single data set.
4. Same or adjacent areas, but referenced in different coordinate systems.

Data point transformation


Transform our points into other representations in order to facilitate interpretation and/or integration
with other data → interpolation
Interpolating discrete data
Data is represented as a set of discrete points, such as x, y coordinates and a value at each point.
This type of interpolation is used to estimate the value at a location between the known data points.
1. Nearest-neighbour interpolation: each location is assigned the value of the closest
measured point.
If the desired output was a polygon layer, we could construct Thiessen polygons around the points
of measurement. The boundaries of such polygons, by definition are the locations for which more
than one point of measurement is the closest point.
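Nearest-neighbour interpolation is short enough to write out. This is a minimal sketch that assumes the measurements are given as (x, y, value) triples and uses straight-line distance; the function name is illustrative.

```python
import math

def nearest_neighbour(samples, x, y):
    """Return the value of the measured point closest to (x, y).

    samples: list of (x, y, value) triples.
    On a tie (a location on a Thiessen polygon boundary) the first
    of the equally close points in the list is used.
    """
    closest = min(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    return closest[2]
```

All locations that return the same sample form that sample's Thiessen polygon, which is why the result is a discrete field with abrupt changes at the polygon boundaries.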
Interpolating continuous data
1. Trend surface fitting using regression,
2. Triangulation,
3. Spatial moving averages using inverse distance weighting,
4. Kriging.
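Of the methods listed, inverse distance weighting is the easiest to sketch. The version below assumes (x, y, value) samples, uses all samples rather than a moving window, and takes a distance exponent of 2 as a default; these are illustrative choices, not prescribed by the method.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (x, y, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(sx - x, sy - y)
        if d == 0.0:
            return value            # exactly on a sample: use its value
        w = 1.0 / d ** power        # closer samples get larger weights
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total
```

Unlike the nearest-neighbour result, the estimate changes gradually between the samples, which suits continuous fields such as elevation.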

VI. Spatial data analysis

Classification of analytical GIS capabilities


Classification, retrieval, and measurement functions.
All functions in this category are performed on a single (vector or raster) data layer, often using the
associated attribute data.
1. Classification allows the assignment of features to a class on the basis of attribute values
or attribute ranges (definition of data patterns).
2. Retrieval allows the selective search of data.
3. Generalization: joins different classes of objects with common characteristics to a
higher level (generalized) class.
4. Measurement: allows the calculation of distances, lengths, or areas.
Overlay functions
Combination of two (or more) spatial data layers comparing them position by position, and treating
areas of overlap—and of non-overlap—in distinct ways.
o Intersection, union, difference, complement
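For raster layers, the four overlay operators reduce to cell-by-cell logic. The sketch below uses two small invented layers of 0/1 cells (`forest` and `steep`) that cover the same area with the same resolution.

```python
def overlay(layer_a, layer_b, op):
    """Combine two equally sized 0/1 rasters position by position."""
    return [[int(op(bool(a), bool(b))) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]

forest = [[1, 1, 0],
          [0, 1, 0]]
steep = [[0, 1, 1],
         [0, 1, 1]]

intersection = overlay(forest, steep, lambda a, b: a and b)      # forest AND steep
union = overlay(forest, steep, lambda a, b: a or b)              # forest OR steep
difference = overlay(forest, steep, lambda a, b: a and not b)    # forest but not steep
complement = [[1 - cell for cell in row] for row in forest]      # everything not forest
```

Vector overlay answers the same questions but has to compute new polygon geometry, which is considerably more work.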
Neighbourhood functions
1. Search functions: retrieval of features within a search window
2. Buffer zone: spatial envelope around a feature
3. Interpolation: predict unknown values using the known values at nearby locations.
4. Topographic: determine characteristics of an area by looking at the immediate
neighbourhood as well.
Connectivity functions.
Connectivity functions are a type of spatial analysis function that work on the basis of networks.
o Contiguity: evaluate a characteristic of a set of connected spatial units, for example, the
search for a contiguous area of forest of a certain size and shape in a satellite image.
o Network analytic: are used to compute over connected line features that make up a network.
o Visibility functions are used to compute the points visible from a given location using a
DTM.

GIS and Application Models


Characteristics of GIS-Based models
1. The purpose
a. Descriptive
b. Prescriptive
c. Predictive
2. The methodology
a. Stochastic
b. Deterministic
c. Rule-based
d. Agent-based
3. The scale
a. Individual
b. Aggregate
4. Its dimensionality
a. Static: no time
b. Dynamic: time is essential
5. Implementation logic
a. Inductive
b. Deductive
c. Coupling

Error propagation in spatial data processing


The accumulation of errors that occur during the processing and manipulation of spatial data. It can
occur at multiple stages of spatial data processing.
Quantifying error propagation
Estimating the magnitude of the errors that occur during spatial data processing and determining
how these errors will affect the final results. This can be done by using a combination of statistical
methods, such as error propagation analysis, and visual inspection methods such as error maps.
One common method for quantifying error propagation is to use a propagation of error
formula.
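The guide does not say which formula is meant; a common choice is first-order propagation for independent errors, where the variance of the result is the sum of the squared partial derivatives times the input variances. The sketch below applies that assumption to the area of a rectangular parcel; all numbers are invented.

```python
import math

def propagate(partials, sigmas):
    """Standard deviation of f(x1..xn) for independent input errors.

    partials: partial derivatives of f with respect to each input
    sigmas:   standard deviations of the inputs
    """
    return math.sqrt(sum((p * s) ** 2 for p, s in zip(partials, sigmas)))

# Area A = length * width, so dA/dlength = width and dA/dwidth = length.
length, width = 100.0, 50.0           # metres
sigma_length = sigma_width = 0.5      # metres
sigma_area = propagate([width, length], [sigma_length, sigma_width])
```

With these values the area of 5000 m² carries a standard deviation of roughly 56 m², even though each side is known to half a metre.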
VII. Data Visualization

GIS and Maps


Maps can be used to identify patterns and trends in the data, such as the distribution of poverty or the
spread of a disease. They can also be used to create informative and compelling visualizations
that help users understand complex data in a visual and intuitive way.

The map scale is the ratio between a distance on the map and the corresponding distance in
reality.
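As a worked example of the ratio, a map distance can be converted to a ground distance by multiplying with the scale denominator. The function name and the choice of centimetres and metres as units are assumptions made for illustration.

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Ground distance in metres for a distance measured on a 1:scale_denominator map."""
    return map_distance_cm * scale_denominator / 100.0   # 100 cm per metre
```

On a 1:50,000 map, 4 cm therefore corresponds to 2,000 m in the terrain.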
A topographic map visualizes, limited by its scale, the Earth’s surface as accurately as possible.
Thematic maps represent the distribution of particular themes. One can distinguish between socio-
economic themes and physical themes.
Visual variables
o Size,
o Value (lightness),
o Texture
o Colour,
o Orientation and
o Shape.

How to Map
Key considerations:
o Map purpose: The map should be designed with a specific purpose in mind, such as
informing, educating, or persuading.
o Map scale: The map should be designed at an appropriate scale, taking into account the
size of the area being mapped and the level of detail required.
o Map projections: The map should be designed using an appropriate projection, taking into
account the shape of the earth and the area being mapped.
o Map symbols and colors: The map should use appropriate symbols and colors to represent
the data, taking into account the meaning of the data and the audience for the map.
o Map layout and design: The map should be designed with a clear and logical layout, taking
into account the importance of the data and the audience for the map.
o Map accuracy and precision: The map should be designed with the appropriate level of
accuracy and precision, taking into account the source data and the intended use of the map.
o Map labeling and annotation: The map should be labeled and annotated in a clear and
consistent way, taking into account the importance of the data and the audience for the map.
o Map evaluation: The map should be evaluated for its effectiveness in communicating the
data it represents.
