Learning Unit 4: Data Analysis
GGH3703: Introduction to GIS
2023
Contents
1. An overview of this learning unit
2. Differentiating between data and information
3. The analysis of spatial data
4. Analysis methods
5. Vector analysis methods
5.1 Searches and queries
5.2 Data classification
5.3 Measurements
5.4 Single layer analysis
1) Buffering
2) Geoprocessing operations
5.5 Multiple layer analysis
1) Overlay operations
2) Spatial joins
6. Vector errors
Practical activity 4.1: Measuring distances and areas
Practical activity 4.2: Attribute queries and buffers
Practical activity 4.3: Selections and overlays
7. Raster data analysis
7.1 Single layer analysis
1) Reclassification
2) Raster buffers
7.2 Multiple layer analysis
1) Mathematical raster overlays
2) Boolean raster overlays
3) Relational raster overlays
8. Scale of analysis
9. Surface analysis
9.1 Creating surfaces
9.2 Terrain mapping
Optional exercise: Practical activity 4.4: Interpolation
10. Example: Determining the different analytical approaches to determine the answers to the questions
Scenario 1: Building a new house
Scenario 2: Rhino poaching
Scenario 3: Water provisioning
Activity: Application of analysis functions
11. Concluding data analysis
1. An overview of this learning unit
read various online resources to familiarise yourself with the analysis of raster and vector data
search the internet for resources to familiarise yourself with some of the methods used to
transform spatial data into spatial information
complete activities that form part of the assignments
complete practical exercises using QGIS
Learning unit 4 introduces you to various resources that form part of your learning material. It is
essential that you complete ALL the activities of this learning unit before you proceed to the next
learning unit.
This learning unit will provide you with the opportunity to explore different methods used when
analysing spatial data. The theoretical component will focus on the analysis of raster and vector data
while the practical activities will mostly focus on the analysis of vector data. This learning unit will
constantly switch between the theory component and practical activities. Keep your QGIS project open
when you work through this learning unit. You will benefit from working on the theory and practical
activities of this learning unit concurrently.
It is extremely important that you distinguish between the methods used for raster analysis and the
methods used for vector analysis.
2. Differentiating between data and information
Search for resources online to differentiate between the terms “data” and “information”. To make sure
that you are familiar with these two terms, write down a definition for each of the terms and write
down a sentence or two to differentiate between data and information.
Statistics allows us to convert numerical data into information by applying certain analysis methods and
functions. Most importantly, statistics is a way of converting non‐spatial data into information. Since we
are dealing with spatial data in geography, consider the type of processing that needs to be done to
transform spatial data into spatial information.
3. The analysis of spatial data
We conduct spatial analysis to transform the data we have into meaningful information. Follow this link
for an easy description of what spatial analysis is:
https://apps.carleton.edu/collab/spatial_analysis/SpatialAnalysis/
A more formal definition of spatial analysis can be found on the following website:
http://dictionary.sensagent.com/Spatial%20analysis/en-en/
Search for two more definitions of spatial analysis on the internet. Read the four definitions critically.
You will notice that these definitions have some commonalities. The most important common
characteristic is that spatial analysis refers to data with a location on earth. The second most important
characteristic is that the data is analysed in space instead of just in tables or lists of data. Thirdly, the
analysis is also concerned with the relationships among the different features (features occurring at a
distance from other features, or features contained or intersected by other features).
We conduct spatial analysis to transform the data we have into information that informs us (or the
person for whom we did the analysis). While we will deal with the outputs of GIS in the next learning
unit, maps are often used to communicate the information that we have obtained from analysis.
When searching for the definition of spatial analysis, you may have noticed that we use
specific techniques and functions in the GIS software to obtain information from spatial
data. It is very important that you distinguish between the terminology used
for the theory component of this module and the terminology used for the software
functions. It has been proven that once a person understands the theory component and
knows what needs to be done, they are able to work out how to do it. For the purpose of this
module and for answering your assignments you will be required to use the theory
terminology.
In the rest of this learning unit, you will be introduced to various spatial data analysis methods through
various examples. Since this module is introductory in nature, we will only explore some of the most
basic techniques that can be used to obtain information from spatial data, and will not cover the
advanced techniques of obtaining information from spatial data. Once again, please keep in mind that
the focus of this module is on understanding. It is essential that you understand why each of the analysis
techniques is necessary and why and where we would typically apply these techniques. The raster and
vector analysis methods that will be explained in the module are summarised in figure 1.
Keep this summary at hand to guide you through your studies. The summary of analysis methods is
based on the book Essentials of geographic information systems by Saylor Academy (2012). It can be
found by following this link:
https://saylordotorg.github.io/text_essentials-of-geographic-information-systems/index.html
We will focus specifically on chapters 6, 7 and 8.
4. Analysis methods
The e‐book referenced in the previous section explains the analysis methods very well. I will therefore
only provide the headings and a short summary of each analysis method.
It is extremely important that you distinguish between the different vector and raster
data analysis methods. Raster and vector data are different spatial data models (refer
back to learning unit 3). There are fundamental differences in the file structures of the
spatial data models, hence the need for different analysis methods. If you would like to
apply a vector analysis method (e.g. a spatial join) on a raster data model, you first need
to do a data conversion from raster to vector before applying the analysis method.
Please see figure 1 for the different vector and raster analysis methods.
5. Vector analysis methods
5.1 Searches and queries
Please note that it will not be expected of you to write your own SQL expressions. When you do the
practical activity, you will see that QGIS greatly assists you through the process. However, it is important
that you understand the basic concepts.
Venn diagrams are extremely important. Venn diagrams are used to illustrate the outputs of Boolean
operators. Boolean operators include the statements of AND, OR, XOR and NOT. Please study these
carefully.
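As an illustration only, the four Boolean operators can be sketched with Python sets, where each set plays the role of one circle in a Venn diagram. The feature IDs and the two conditions below are hypothetical and are not taken from the prescribed material.

```python
# Hypothetical feature IDs; each set holds the features that meet one condition.
all_features = {1, 2, 3, 4, 5, 6}
residential = {1, 2, 3, 4}      # condition A: zoned residential
for_sale = {3, 4, 5}            # condition B: currently for sale

a_and_b = residential & for_sale    # AND: features meeting both conditions
a_or_b = residential | for_sale     # OR: features meeting at least one condition
a_xor_b = residential ^ for_sale    # XOR: features meeting exactly one condition
not_a = all_features - residential  # NOT: features that do not meet condition A

print(sorted(a_and_b), sorted(a_or_b), sorted(a_xor_b), sorted(not_a))
```

Drawing the two sets as overlapping circles and shading each result gives the corresponding Venn diagram.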
Take note of all the different ways in which query by geography (or spatial queries) can be conducted.
When asked to identify analysis methods used in various applications, you will be expected to indicate
the relevant spatial relationship as well.
5.2 Data classification
Take note of the different classification methods used to create a classification map. Make sure that you
know when each method should be used. When answering your assignment, you will be required to
identify the correct classification method to be used when creating classification maps.
5.3 Measurements
Apart from searching for specific attribute data, we are sometimes also interested in distance
measurements on a map or in a GIS.
Before considering how distance is calculated in a GIS, let us make a quick comparison between distance
measurements on a map and in a GIS.
Maps have a constant scale, which represents the extent to which reality has been reduced to fit on the
map sheet. For example, the map scale of a topographical map is 1:50 000, which means that one unit of
measurement on the map represents 50 000 of that same unit of measurement in reality. In other
words, 1 cm on the map is equal to a distance of 50 000 cm in reality.
Since the scale of a map is fixed, we can easily calculate the real distance of a measurement on a map by
multiplying it by the map scale. For example:
If the measurement on the map is 5 cm, the distance that it represents in reality is 250 000 cm (5 x
50 000).
However, no one would say that they live 250 000 cm from a school. Therefore, we convert the distance
from centimetres to a more practical unit, such as metres or kilometres.
We can easily convert between the different units of measurement in the same measurement system
(the SI system) by applying simple multiplication factors. For example, converting from centimetres to
kilometres simply requires division by 100 000. So, 250 000 cm is equal to 2,5 km.
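The calculation above can be sketched as a small function. This is a minimal illustration; the function name and parameters are hypothetical.

```python
def map_to_ground_km(map_cm: float, scale_denominator: int) -> float:
    """Convert a distance measured on a map (cm) to a real-world distance (km)."""
    ground_cm = map_cm * scale_denominator   # e.g. 5 cm x 50 000 = 250 000 cm
    return ground_cm / 100_000               # there are 100 000 cm in one kilometre

print(map_to_ground_km(5, 50_000))  # 2.5
```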
At this stage of your studies, you should be completely familiar with the metric measurement system
and how to convert between the different units of measurement. However, if you need to refresh your
memory on measurement systems and converting between units of measurement, explore the
following resources:
Why do we need to discuss the measurement of distance on maps before we go on to consider the
measurement of distance in a GIS?
Unlike maps, a GIS does not have a fixed map scale. Visit Google Maps and explore this concept by
zooming in and out of certain locations on the map. To see how a real‐world distance is represented by
varying distances (depending on your level of zoom) rather than a fixed distance dictated by a particular
map scale, click on the "Get Directions" button in Google Maps and get directions between
Johannesburg and Cape Town. Once the route between these two points is indicated in blue, use the
zoom function to zoom in and out. Notice how the real distance between these two points remains the
same (in the directions panel), but the distance on the computer screen changes as you zoom in and
out.
Explore the Selection and measurement section in Chapter 5 of GIS Commons for a short
overview of measurement in GIS.
We are sure you have realised, from the GIS Commons sections, that as with
most techniques, there is a difference between measuring distance in raster and
in vector. Read the following booklet for an explanation of how distance is
calculated in a raster and vector GIS respectively:
“Distance in GIS” under the Additional resources for Learning unit 4 tab.
It is essential that you know the difference between how measurements are done in a raster and vector
GIS. Most likely, you will only select the type of measurement method when working with a GIS, and will
not apply the theory that you have learned. However, it is essential that you are cognisant of the errors
that may be present since the different methods will definitely have an effect on the answer provided by
the GIS.
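To see why the two approaches can give different answers, consider the rough sketch below. It assumes that a vector measurement is a straight line between coordinates and that a simple raster measurement moves only along rows and columns, cell by cell. These are simplifying assumptions for illustration, not a description of the method in the booklet or in any particular GIS.

```python
import math

def vector_distance(p, q):
    """Straight-line (Euclidean) distance between two coordinate pairs."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def raster_step_distance(cell_a, cell_b, cell_size):
    """Distance when moving only along rows and columns, cell by cell."""
    rows = abs(cell_b[0] - cell_a[0])
    cols = abs(cell_b[1] - cell_a[1])
    return (rows + cols) * cell_size

# Two locations 3 cells apart in one direction and 4 in the other, with 10 m cells.
print(vector_distance((0, 0), (30, 40)))          # 50.0
print(raster_step_distance((0, 0), (3, 4), 10))   # 70
```

The same two locations are 50 m apart in the vector measurement and 70 m apart in the stepped raster measurement, which is the kind of difference you need to keep in mind.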
5.4 Single layer analysis
Single layer analysis can be subdivided into buffers and geoprocessing operations.
1) Buffering
Make sure that you are able to distinguish between the different buffering options, namely constant
width buffers, variable width buffers, doughnut buffers, multiple ring buffers, setback buffers, non‐
dissolved buffers and dissolved buffers. Do research on the internet and identify scenarios when each of
these various buffers will be applied to solve a real‐world problem.
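As a simple illustration of two of these options, the sketch below tests whether a location falls inside a constant width buffer or a doughnut buffer around a point feature. The coordinates, widths and function names are hypothetical.

```python
import math

def in_constant_buffer(feature, point, width):
    """True if the point lies within `width` of a point feature."""
    return math.dist(feature, point) <= width

def in_doughnut_buffer(feature, point, inner, outer):
    """True if the point lies in the ring between the inner and outer widths."""
    return inner < math.dist(feature, point) <= outer

school = (0.0, 0.0)          # hypothetical point feature, coordinates in metres
house = (300.0, 400.0)       # 500 m from the school

print(in_constant_buffer(school, house, 1000))       # True
print(in_doughnut_buffer(school, house, 600, 1000))  # False
```

The house is inside the 1 000 m constant width buffer, but not inside the doughnut buffer, because the doughnut excludes everything closer than 600 m.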
2) Geoprocessing operations
A geoprocessing operation is not an analysis method in itself. Geoprocessing basically means the
processing (handling) of spatial data.
5.5 Multiple layer analysis
1) Overlay operations
Various basic overlay processes are available for vector datasets, namely point‐in‐polygon, polygon‐on‐
point, line‐on‐line, line‐in‐polygon, polygon‐on‐line and polygon‐on‐polygon. Overlays also use Boolean
operators (AND, OR and XOR) to create an intersection, union, symmetrical difference or identity. Please
note that these outputs should not be confused with the Boolean operators used when doing attribute
queries.
2) Spatial joins
You have already been introduced to joins in learning unit 3 when you combined a spatial data layer
with an attribute data layer. A spatial join can also combine the attributes of two spatial layers based on the spatial
relationship between the features in the two layers.
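The idea of a spatial join can be sketched as follows: each point receives the name of the polygon that contains it. This is a minimal sketch that uses a standard ray-casting test for the "contained by" relationship. The layers, names and coordinates are hypothetical, and QGIS performs this through its own tools rather than through code like this.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical layers: suburb polygons and school points.
suburbs = {
    "North": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "East": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
schools = {"School A": (5, 5), "School B": (15, 2), "School C": (30, 30)}

# Spatial join: give each school the name of the suburb that contains it.
joined = {
    school: next((name for name, poly in suburbs.items()
                  if point_in_polygon(pt, poly)), None)
    for school, pt in schools.items()
}
print(joined)
```

A school that falls outside every suburb receives no value (None), just as an unmatched feature receives empty attributes after a join.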
6. Vector errors
We need to be aware of possible errors that may arise during the analysis process or errors that are
present that may lead to making the wrong decisions.
Practical activity 4.1: Measuring distances and areas
In this practical activity, you will learn how to calculate the lengths and areas of spatial features.
Complete this self‐assessment test on the module site before you continue with the rest of the
practical activities. The self‐assessment will also help you to prepare for the Assignments.
Practical activity 4.2: Attribute queries and buffers
In this practical activity, you will learn how to create vector buffers for point, line and area features.
In the practical activities you completed in learning unit 3, you captured attribute data and joined
attribute data to spatial data. In this practical activity, you will also learn how to transform this data
into information by doing attribute queries.
Complete this self‐assessment test on the module site before you continue with the rest of the
practical activities. The self‐assessment will also help you to prepare for the Assignments.
Practical activity 4.3: Selections and overlays
In this practical activity, you will learn how vector selections and overlays are done using QGIS.
A video to explain the difference between overlay (intersect) and spatial selection (intersect) is
available under Additional resources for learning unit 4. (Note: This video runs quite fast, so be ready
to pause the video in order to read the comments.)
Complete this self‐assessment test on the module site before you continue with the rest of the
practical activities. The self‐assessment will also help you to prepare for the Assignments.
7. Raster data analysis
https://saylordotorg.github.io/text_essentials-of-geographic-information-systems/index.html
7.1 Single layer analysis
1) Reclassification
The reclassification of a raster layer should not be confused with the classification of a vector data layer.
2) Raster buffers
Although raster buffers are described as less accurate, they are not inaccurate per se. As discussed in
the previous learning units, a raster data model is used for data with a continuous change in value.
Raster buffers should be interpreted in this context.
7.2 Multiple layer analysis
1) Mathematical raster overlays
Mathematical raster overlays can be done using any mathematical calculation. You could, for example,
have raster layers of rainfall over a period. A mathematical raster overlay can be performed to calculate
the total rainfall, minimum or maximum rainfall or the average rainfall.
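The rainfall example can be sketched with small grids stored as nested lists. The rainfall values and the helper function `overlay` are hypothetical and serve only to show that the calculation is applied to the cells that share the same position in every layer.

```python
# Three hypothetical monthly rainfall rasters (mm) covering the same 2 x 2 grid.
january = [[10, 20], [30, 40]]
february = [[5, 25], [35, 15]]
march = [[15, 0], [10, 20]]
layers = [january, february, march]

def overlay(layers, func):
    """Apply func to the values that share a cell position across all layers."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [[func([layer[r][c] for layer in layers]) for c in range(cols)]
            for r in range(rows)]

total = overlay(layers, sum)
minimum = overlay(layers, min)
maximum = overlay(layers, max)
average = overlay(layers, lambda values: sum(values) / len(values))

print(total)    # [[30, 45], [75, 75]]
```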
2) Boolean raster overlays
The Boolean connectors AND, OR and XOR can be used to perform Boolean raster overlays.
3) Relational raster overlays
Relational raster overlays employ the operators < (smaller than), > (bigger than), = (equal to), <> (not
equal to), <= (smaller than or equal to) and >= (bigger than or equal to) to perform overlays. Relational
overlays can be used, for example, to find areas with an increase in soil moisture (> operator) or a
decrease in rainfall (< operator).
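Boolean and relational raster overlays can be sketched in the same cell-by-cell way. The soil moisture values, the threshold of 25 and the helper function `cell_overlay` are hypothetical.

```python
# Hypothetical soil moisture rasters (%) for two dates on the same 2 x 3 grid.
moisture_before = [[12, 30, 25], [40, 18, 22]]
moisture_after = [[15, 28, 25], [45, 10, 30]]

def cell_overlay(layer_a, layer_b, func):
    """Combine two rasters cell by cell with a two-argument function."""
    return [[func(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]

# Relational overlay: True where soil moisture increased (> operator).
increased = cell_overlay(moisture_after, moisture_before, lambda a, b: a > b)

# A second relational result: True where the later value is at least 25 (>=).
wet = [[value >= 25 for value in row] for row in moisture_after]

# Boolean overlays on the two True/False rasters.
both = cell_overlay(increased, wet, lambda a, b: a and b)     # AND
either = cell_overlay(increased, wet, lambda a, b: a or b)    # OR
only_one = cell_overlay(increased, wet, lambda a, b: a != b)  # XOR

print(increased)
```

Each output raster holds True or False per cell, which is why the results of relational overlays can be fed directly into Boolean overlays.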
8. Scale of analysis
The scale of analysis indicates the scale at which an operation is performed and can incorporate any
number of analysis methods. Raster analysis can be performed at four different scales, namely local,
neighbourhood, zonal and global. The scale of analysis should not be confused with the different
analysis methods. The different scales of analysis summarise the way in which the analysis methods are
applied. If a mathematical raster overlay is applied to one raster layer on single cells, it is called a local
operation. If a mathematical analysis function is applied, for example to calculate averages for multiple
layers, a global operation is applied. This scale of analysis involves the whole extent of the raster layer.
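The four scales can be contrasted in a short sketch on a single small raster. It follows the common usage of the terms (a local operation uses one cell at a time, a neighbourhood operation uses a cell and its neighbours, a zonal operation summarises cells per zone, and a global operation uses the whole extent). The rasters and zone labels are hypothetical.

```python
elevation = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]            # hypothetical 3 x 3 raster
zones = [["a", "a", "b"],
         ["a", "b", "b"],
         ["c", "c", "c"]]          # hypothetical zone raster

# Local: each output cell depends only on the same cell in the input.
local = [[value * 2 for value in row] for row in elevation]

# Neighbourhood: each output cell depends on the cell and its neighbours.
def neighbourhood_mean(raster, r, c):
    cells = [raster[i][j]
             for i in range(max(r - 1, 0), min(r + 2, len(raster)))
             for j in range(max(c - 1, 0), min(c + 2, len(raster[0])))]
    return sum(cells) / len(cells)

# Zonal: cells are grouped by zone and summarised per zone.
zonal_sum = {}
for zone_row, value_row in zip(zones, elevation):
    for zone, value in zip(zone_row, value_row):
        zonal_sum[zone] = zonal_sum.get(zone, 0) + value

# Global: one summary for the whole extent of the raster.
all_cells = [value for row in elevation for value in row]
global_mean = sum(all_cells) / len(all_cells)

print(local[0], neighbourhood_mean(elevation, 1, 1), zonal_sum, global_mean)
```

Note that the same calculation (a mean or a sum) appears at more than one scale. The scale describes which cells take part in the calculation, not which calculation is used.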
9. Surface analysis
9.1 Creating surfaces
Study chapter 8, topic 8.3 Surface analysis: interpolation:
https://saylordotorg.github.io/text_essentials-of-geographic-information-systems/index.html
9.2 Terrain mapping
Watershed analysis is also a terrain mapping function that includes a series of analysis methods.
Watershed analysis is very specialised and requires specialised software to be performed.
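10. Example: Determining the different analytical approaches to determine the answers to the questions
We are now working with “Step 5: Analyse the data”. We will discuss this section based on the three
example scenarios as discussed in the previous lessons.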
Scenario 1: Building a new house
Suppose you want to build a new house for your family. This
house should be in a residential area and close to a primary
school. You would also like the house to be close to a shopping
centre and the national highway should be easily accessible.
The property should be for sale at an affordable price.
Spatial question:
The spatial question is asked in the form of a question and ends with a question mark.
Spatial criteria
The best location for a new house is where the following criteria apply:
The property should be in a residential area. The property should not cost more than R800 000. The
property should be for sale.
Determine residential areas and indicate the locations of vacant land in these residential areas that are
for sale with a maximum value of R800 000. This is done using the “query by attribute” function and the ‘AND’
Boolean operator.
Select all the primary schools in the identified residential areas with the “query by attribute” function,
using the query: Where “Type” = “Primary”.
Show all the shopping centres within the identified areas using spatial selection.
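The first step, a query by attribute that combines conditions with AND, can be sketched as a filter over an attribute table. The field names, records and values are hypothetical. The price limit follows the spatial criterion that the property should not cost more than R800 000.

```python
# Hypothetical attribute table; field names and values are illustrative only.
properties = [
    {"id": 1, "zoning": "Residential", "status": "For sale", "price": 750_000},
    {"id": 2, "zoning": "Residential", "status": "Sold", "price": 600_000},
    {"id": 3, "zoning": "Commercial", "status": "For sale", "price": 500_000},
    {"id": 4, "zoning": "Residential", "status": "For sale", "price": 950_000},
]
MAX_PRICE = 800_000  # assumption: the limit stated in the spatial criteria

# Query by attribute: every condition must hold, so they are joined with AND.
selected = [
    p["id"] for p in properties
    if p["zoning"] == "Residential"
    and p["status"] == "For sale"
    and p["price"] <= MAX_PRICE
]
print(selected)  # [1]
```

Only property 1 meets all three conditions. Replacing AND with OR would select every property that meets at least one condition, which is not what the criteria ask for.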
Scenario 2: Rhino poaching
Spatial question:
Spatial criteria
Field data (collar tracking GPS coordinates). You create a minimum bounding polygon using the GPS
coordinates. This polygon includes all the GPS points of the specified rhinoceros.
Hotspot areas of previous poaching incidents
A heat map will be created using the kernel density estimation (KDE) function in a GIS.
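The idea behind a KDE heat map can be sketched as follows: every incident contributes a smooth "bump" to the surface, and locations where many bumps overlap receive high density values. This sketch assumes a Gaussian kernel, and the incident coordinates and bandwidth are hypothetical. A GIS may use a different kernel and will handle the grid for you.

```python
import math

def kernel_density(x, y, incidents, bandwidth):
    """Sum of Gaussian kernels centred on each incident, evaluated at (x, y)."""
    total = 0.0
    for ix, iy in incidents:
        d2 = (x - ix) ** 2 + (y - iy) ** 2
        total += math.exp(-d2 / (2 * bandwidth ** 2))
    return total / (2 * math.pi * bandwidth ** 2 * len(incidents))

# Hypothetical incident coordinates (km) and search radius (bandwidth).
incidents = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0)]
bandwidth = 1.0

# Evaluate the density on a coarse grid; high values mark the hotspot.
grid = [[kernel_density(x, y, incidents, bandwidth) for x in range(7)]
        for y in range(7)]
hotspot = max(((grid[y][x], (x, y)) for y in range(7) for x in range(7)))[1]
print(hotspot)
```

The grid cell with the highest density lies in the cluster of three incidents, not at the isolated incident, which is what makes a heat map useful for identifying hotspots.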
Scenario 3: Water provisioning
Spatial problem:
Spatial questions:
Is the number of water tanks provided in the Loloka community sufficient for their daily needs?
Where should additional water tank(s) be installed to provide for the community's daily needs effectively?
Spatial criteria
The water tanks can sufficiently provide for the daily needs of the community if:
every household has a water tank within 150 metres of their house
Additional water tank(s) to provide for the daily needs of the community should be installed:
Please note that this is an example of advanced GIS analysis that can be done. GGH3710 will discuss some
of these analyses in more detail.
The analysis of how to determine the location of the new water tanks falls outside the scope of this
module.
11. Concluding data analysis
This learning unit should have made you aware of some of the methods we can use to obtain the
information that we need to answer specific questions. The type of question to which we want an
answer will dictate the type of analysis tool we use. The practical activities in this learning unit should
have made you aware of the analysis techniques required to answer these questions.
From the resources you explored and the practical activities you completed, you would have noticed
that data analysis methods are not limited to the ones covered in this learning unit. However, since this
is an introductory module, the intention was to make you aware of the basic analysis functions that can
be performed in a GIS. Often, the broad categories you have dealt with in this module contain more
specialised analysis functions.
In the next learning unit, you will explore the outputs of a GIS and how to communicate the spatial
information you have just transformed.