GIS Notes v2
A GIS (geographic information system) is a computer system for capturing, storing, checking and
displaying data related to positions on the earth’s surface. It helps with understanding spatial patterns
and relationships. It provides the following four sets of capabilities to handle georeferenced data:
1. Data capture and preparation
a. Data capture is a tedious job in GIS. A GIS can be used to emphasize the spatial
relationships among the objects being mapped. If the data to be used are not already
in digital form, that is, in a form a computer can understand and recognize, various
techniques are available to capture the information. Maps can be digitized, or hand
traced with a computer mouse, to collect the coordinates of the features. Electronic
scanning devices can also convert map lines and points to digital form.
2. Data management
a. This phase requires a decision to be made on how best to represent our data, in terms
of both their spatial properties and the various attribute values which we need to store.
Data management includes data verification, attribute data management,
insertion, updating, deleting and retrieval in different forms.
3. Data manipulation and analysis
a. Data analysis can be done once the data have been collected and organized in a
computer system.
4. Data presentation
a. After the data manipulation, our data is prepared for producing output. This phase
deals with putting it all together into a format that communicates the result of data
analysis in the best possible way.
Applications of GIS
a. A biologist might be interested in the impact of slash-and-burn practices on the
populations of amphibian species in the forests of a mountain range to obtain a better
understanding of long-term threats to those populations;
b. A natural hazard analyst might like to identify the high-risk areas of annual
monsoon-related flooding by investigating rainfall patterns and terrain characteristics;
c. A geological engineer might want to identify the best localities for constructing
buildings in an earthquake-prone area by looking at rock formation characteristics;
d. A mining engineer could be interested in determining which prospective copper
mines should be selected for future exploration, taking into account parameters such
as extent, depth and quality of the ore body, among others;
e. A Geoinformatics engineer hired by a telecommunications company may want to
determine the best sites for the company’s relay stations, taking into account various
cost factors such as land prices, undulation of the terrain, etc.
f. A forest manager might want to optimize timber production using data on soil and
current tree stand distributions, in the presence of a number of operational constraints,
such as the need to preserve species diversity in the area.
g. A hydrological engineer might want to study a number of water quality parameters
of different sites in a freshwater lake to improve understanding of the current
distribution of Typha reed beds, and why it differs from that of a decade ago.
GI System, GI Science and GI Applications
1. A GI system (GIS) is a type of database containing geographic data combined with software tools
for managing, analysing and visualising those data.
a. Commercial programmes -> ArcGIS (Esri), Autodesk products
b. Open source -> QGIS, GRASS GIS
2. GI Science is the scientific field that attempts to integrate different disciplines studying the
methods and techniques of handling spatial information. Geographic information science is
the scientific discipline that studies geographic information, including how it represents
phenomena in the real world, how it represents the way humans understand the world, and
how it can be captured, organized and analysed.
3. GIS Application – Has a clear-cut purpose, and such applications can be short-lived: the research
is carried out by collecting data, entering data in the GIS, analysing the data, and producing
informative maps. An example is rapid earthquake damage assessment.
Spatial Data: contains positional values, such as (x, y) coordinates.
Key components of spatial data quality:
1. Positional accuracy (horizontal and vertical)
2. Temporal accuracy (data is up to date)
a. Geographic phenomena are also dynamic so they change over time.
Examples of the kinds of questions involving time include:
i. Where and when did something happen?
ii. How fast did this change occur?
iii. In which order did the changes happen?
b. Spatiotemporal data models are ways of organizing representations
of space and time in a GIS. Different concepts of time:
i. Discrete and continuous time
ii. Valid time and transaction time
iii. Linear, branching and cycling time
iv. Time granularity
v. Absolute and relative time
3. Attribute accuracy (labelling of features)
4. Lineage (history of the data including the sources)
5. Completeness (the data set represents all related features of reality)
6. Logical consistency (data is logically structured)
Geo-Spatial Data: Used as a further refinement, referring to spatial data that is georeferenced.
Geospatial data is therefore also known as georeferenced data.
Geo-information – A specific type of information resulting from the interpretation of spatial data.
Modelling - ‘Modelling’ is a term used in many different ways and which has many different
meanings. A representation of some part of the real world can be considered a model because the
representation will have certain characteristics in common with the real world. Specifically, those
which we have identified in our model design. This then allows us to study and operate on the model
itself instead of the real world in order to test what happens under various conditions, and help us
answer ‘what if’ questions. Models—as representations—come in many different flavours. In the GIS
environment, the most familiar model is that of a map. A map is a miniature representation of some
part of the real world. Paper maps are the most common, but digital maps also exist. Databases are
another important class of models. A database can store a considerable amount of data, and also
provides various functions to operate on the stored data. The collection of stored data represents some
real-world phenomena, so it too is a model. Obviously, here we are especially interested in databases
that store spatial data. Digital models (as in a database or GIS) have enormous advantages over paper
models (such as maps). They are more flexible, and therefore more easily changed for the purpose at
hand. In principle, they allow animations and simulations to be carried out by the computer system.
This has opened up an important toolbox that can help to improve our understanding of the world.
Most maps and databases can be considered static models. At any point in time, they represent a
single state of affairs. Usually, developments or changes in the real world are not easily recognized in
these models. Dynamic models, or process models, address precisely this issue. They emphasize
changes that have taken place, are taking place or may take place sometime in the future. Dynamic
models are inherently more complicated than static models, and usually require much more
computation. Simulation models are an important class of dynamic models that allow the simulation
of real-world processes.
Modelling is the process of producing an abstraction of the ‘real world’ so that some part of it
can be more easily handled.
Dynamic Model -> inherently more complicated than static models.
Simulation Models – are an important class of dynamic models that allow the simulation of real-
world processes.
Maps -> Their conception and design have developed into a science with a high degree of
sophistication. A disadvantage of the traditional paper map is that it is generally restricted to 2D static
representations and has a fixed scale.
Databases
A database is a repository for storing large amounts of data. It comes
with a number of useful functions:
1. A database can be used by multiple users at the same time, i.e. it allows concurrent use.
2. A database offers a number of techniques for storing data and allows the use of the most
efficient one, i.e. it supports storage optimization.
3. A database allows the imposition of rules on the stored data, rules that will be automatically
checked after each update to the data, i.e. it supports data integrity.
4. A database offers an easy-to-use data manipulation language, which allows the execution of all
sorts of data extraction and data updates, i.e. it has a query facility.
5. A database will try to execute each query in the data manipulation language in the most
efficient way, i.e. it offers query optimization.
Databases can store almost any kind of data in different forms, such as tables.
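As a rough illustration of the data integrity and query facility points above, the sketch below uses Python's built-in sqlite3 module. The table name `city`, its columns and the sample rows are assumptions made up for this example; they are not part of the notes.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database whose table carries coordinate-range rules."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE city (
            name TEXT PRIMARY KEY,
            lon  REAL NOT NULL CHECK (lon BETWEEN -180 AND 180),
            lat  REAL NOT NULL CHECK (lat BETWEEN -90 AND 90)
        )
        """
    )
    return conn


def try_insert(conn, name, lon, lat):
    """Return True if the row was stored, False if an integrity rule rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO city VALUES (?, ?, ?)", (name, lon, lat))
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_demo_db()
accepted = try_insert(conn, "Enschede", 6.89, 52.22)   # valid coordinates
rejected = try_insert(conn, "Nowhere", 200.0, 95.0)    # outside the allowed range

# Query facility: a declarative statement describes what to retrieve, not how.
northern = [row[0] for row in conn.execute(
    "SELECT name FROM city WHERE lat > 0 ORDER BY name")]
```

The rules are checked by the database itself on every insert, so the calling program does not need its own validation code for them.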
Spatial Databases and Spatial Analysis
A spatial database is a general-purpose database (usually a relational database) that has been enhanced
to include spatial data that represents objects defined in a geometric space, along with tools for
querying and analyzing such data. The SQL/MM Spatial ISO/IEC standard is a part of the SQL/MM
multimedia standard and extends the Simple Features standard with data types that support circular
interpolations. A geodatabase (also geographical database and geospatial database) is a database of
geographic data, such as countries, administrative divisions, cities, and related information. Such
databases can be useful for websites that wish to identify the locations of their visitors for
customization purposes. A geodatabase is not the same thing as a GIS, though both systems share a
number of characteristics. These include the functions listed above for databases in general:
concurrency, storage, integrity, and querying of data, specifically, but not only, spatial data.
Geographic Field – A field is a geographic phenomenon for which, for every point in the study
area, a value can be determined.
Data types and values
1. Nominal Data Values
2. Ordinal Data Values
3. Interval Data Values
4. Ratio Data Values
Topological and Spatial Relationships
Boundaries – Where shape and/or size of contiguous areas matter, the notion of boundary comes
into play. This is true for geographic objects but also for the constituents of a discrete
geographic field.
Regular Tessellations
Irregular Tessellations
UNIT 2
Explain the various reasons for using DBMS in GIS
1. A DBMS supports the storage and manipulation of very large data sets
a. Storing such data in text files or spreadsheets is highly inconvenient
2. A DBMS can guard data correctness
a. The stored data should not contain obvious errors
b. The range of possible geographic coordinates is known, so we can have the DBMS
check that stored coordinates fall within it
c. Such rules are integrity constraints, which can be defined in and are automatically
checked by a DBMS
d. More complex integrity constraints are certainly possible, and their definition is
part of the design of the database
3. A DBMS supports the concurrent use of the same data for many users
a. Large data sets are built over time, which means that substantial investments are
required to create and manage them, and that probably many people are
involved in the data collection, maintenance and processing
b. This DBMS function is called concurrency control
4. A DBMS supports a high-level, declarative query language
a. The most important use of the language is the definition of queries
b. A query is a computer program that extracts data from the database that meet
the conditions indicated in the query.
5. A DBMS supports the use of a data model
a. A data model is a language with which one can define a database structure and
manipulate the data stored in it
6. A DBMS includes data backup and recovery functions to ensure data availability at all
times
a. As potentially many users rely on the availability of the data, the data must be
safeguarded against possible calamities. Regular backups of the data set and
automatic recovery schemes provide an assurance against loss of data
7. A DBMS allows the control of data redundancy
a. Storing a fact multiple times gives rise to a phenomenon known as data
redundancy.
b. Data redundancy can lead to situations in which stored facts may contradict
each other, causing reduced usefulness of the data.
c. Redundancy, however, is not necessarily always problematic, as long as we
specify where it occurs so that it can be controlled for.
8. Process of linking GIS with DBMS
a. Storing spatial and attribute Data
i. GIS software provides support for spatial data and thematic or attribute data.
ii. GISs have traditionally stored spatial data and attribute data separately.
iii. This required the GIS to provide a link between the spatial data, represented
with rasters or vectors, and their non-spatial attribute data.
iv. Geographic information systems are strong because they have built-in
capabilities for analysing, storing, and producing maps that are derived from
their understanding of geographical space.
v. GIS packages themselves can store tabular data; however, they do not always
provide a full-fledged query language to operate on the tables.
b. External DBMS
i. DBMS serves as a centralised data repository for all users, while each user
runs her/his own GIS software that obtains its data from the DBMS
c. Linking Objects and tables
i. With raster representation, each raster cell stores a characteristic value
ii. With vector representations, our spatial objects, whether they are points, lines
or polygons are automatically given a unique identifier by the system
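The link between vector objects and attribute tables described above can be sketched with plain Python dictionaries. This is only an illustration: the identifiers, coordinates and attribute names (`owner`, `land_use`) are invented for the example, and a real GIS or DBMS would hold them in its own storage structures.

```python
# Spatial part: geometries keyed by a system-assigned unique identifier.
geometries = {
    1: [(0, 0), (4, 0), (4, 3), (0, 3)],   # polygon vertices (x, y)
    2: [(4, 0), (8, 0), (8, 3), (4, 3)],
}

# Attribute part: one row per feature, keyed by the same identifier.
attributes = {
    1: {"owner": "A", "land_use": "forest"},
    2: {"owner": "B", "land_use": "farm"},
}


def link(geometries, attributes):
    """Join each geometry with its attribute row through the shared identifier."""
    return {
        fid: {"geometry": geom, **attributes.get(fid, {})}
        for fid, geom in geometries.items()
    }


features = link(geometries, attributes)

# An attribute query now also gives access to the matching geometries.
forest_ids = [fid for fid, f in features.items() if f.get("land_use") == "forest"]
```

Because both parts share only the identifier, the attribute table could equally live in an external DBMS while the geometries stay in the GIS.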
Vector data, Raster Data, Polygon Data
1. Vector Data: Vector data represents geographic features using points, lines, and
polygons. These features are defined by their spatial coordinates (X, Y or longitude,
latitude) and are typically used to represent discrete and distinct objects on the Earth's
surface. Examples of vector data include points representing cities, lines representing
roads, and polygons representing land parcels. Vector data stores attributes associated
with each feature, such as population, name, or elevation.
2. Raster Data: Raster data represents geographic features using a grid of cells or pixels.
Each cell has a value that represents a specific attribute or phenomenon. Raster data is
used to represent continuous and continuous-like phenomena, such as elevation,
temperature, or satellite imagery. It is structured as a grid where each cell represents a
small portion of the Earth's surface. The resolution of the raster determines the size and
level of detail of each cell.
3. Polygon: In GIS, a polygon is a type of vector geometry that represents an enclosed area
with a defined boundary. It is defined by a series of interconnected vertices or nodes,
forming a closed shape. Polygons are commonly used to represent areas such as
countries, states, or land parcels. They can have attributes associated with them, making
them useful for spatial analysis and querying.
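To make the difference between the representations concrete, here is a minimal sketch in plain Python. All feature names, coordinates and elevation values are made up, and the raster is assumed to have its origin at the top-left corner with square cells.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its vertex list (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def raster_value(grid, origin_x, origin_y, cell_size, x, y):
    """Value of the cell containing (x, y); row 0 is the top row of the grid."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        raise ValueError("location lies outside the raster")
    return grid[row][col]


# Vector features: geometry as coordinates plus attribute values.
city = {"geometry": (3.0, 4.0), "name": "Sample City", "population": 12000}
road = {"geometry": [(0.0, 0.0), (3.0, 4.0)], "name": "Sample Road"}
parcel = {"geometry": [(0, 0), (4, 0), (4, 3), (0, 3)], "owner": "A"}

# Raster: a grid of cell values (here elevation), 10 map units per cell.
elevation = [
    [120, 125, 130],
    [110, 115, 120],
    [100, 105, 110],
]

parcel_area = polygon_area(parcel["geometry"])
height = raster_value(elevation, 0, 30, 10, 15, 25)
```

The vector features store exact coordinates for each object, while the raster only knows a value per cell, so its level of detail depends on the cell size.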
UNIT 4
Automatic Classification
User-controlled classifications require a classification table or user interaction. GIS software
can also perform automatic classification, in which a user only specifies the number of classes in
the output data set. The system automatically determines the class break points. Two main
techniques of determining break points are in use.
1. Equal interval technique: The minimum and maximum values vmin and vmax of the classification
parameter are determined and the (constant) interval size for each category is calculated as (vmax −
vmin)/n, where n is the number of classes chosen by the user. This classification is useful in revealing
the distribution patterns as it determines the number of features in each category.
2. Equal frequency technique: This technique is also known as quantile classification. The objective is
to create categories with roughly equal numbers of features per category. The total number of features
is determined first and divided by the required number of categories to give the number of features per
category. The class break points are then determined by counting off the features in order of
classification parameter value.
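The two techniques can be compared with a small sketch. The function names and the sample values are assumptions chosen for illustration; actual GIS packages may treat ties and rounding differently.

```python
def equal_interval_breaks(values, n):
    """Break points that split [vmin, vmax] into n intervals of equal width."""
    vmin, vmax = min(values), max(values)
    width = (vmax - vmin) / n
    return [vmin + i * width for i in range(1, n)]


def equal_frequency_breaks(values, n):
    """Break points found by counting off sorted values in roughly equal groups."""
    ordered = sorted(values)
    per_class = len(ordered) / n
    # Each break is the largest value that still belongs to the lower class.
    return [ordered[round(i * per_class) - 1] for i in range(1, n)]


def classify(value, breaks):
    """Class index of a value: the number of break points it exceeds."""
    return sum(value > b for b in breaks)


def class_counts(values, breaks):
    """Number of features that fall in each class."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[classify(v, breaks)] += 1
    return counts


values = [1, 2, 3, 4, 5, 6, 7, 8, 100]   # skewed sample with one outlier

interval_breaks = equal_interval_breaks(values, 3)
frequency_breaks = equal_frequency_breaks(values, 3)
interval_counts = class_counts(values, interval_breaks)
frequency_counts = class_counts(values, frequency_breaks)
```

With this skewed sample, equal interval puts almost every feature in the first class, which exposes the distribution, whereas equal frequency fills all three classes evenly.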
Vector Overlay Operators
In the vector domain, overlay is computationally more demanding than in the raster domain. Here we
will only discuss overlays from polygon data layers, but we note that most of the ideas also apply to
overlay operations with point or line data layers.
The standard overlay operator for two layers of polygons is the polygon intersection operator. It is
fundamental, as many other overlay operators proposed in the literature or implemented in systems
can be defined in terms of it. The result of this operator is the collection of all possible polygon
intersections; the attribute table result is a join—in the relational database sense.
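The idea of polygon intersection combined with an attribute join can be sketched as follows. To keep the geometry short, this example deliberately restricts polygons to axis-aligned rectangles given as (xmin, ymin, xmax, ymax); general polygon intersection needs a proper computational geometry routine. Layer names and attribute values are invented.

```python
from itertools import product


def rect_intersection(a, b):
    """Overlap of two axis-aligned rectangles, or None if they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None   # disjoint, or touching only along an edge or corner
    return (xmin, ymin, xmax, ymax)


def overlay_intersection(layer_a, layer_b):
    """All pairwise intersections, each carrying the attributes of both inputs."""
    result = []
    for fa, fb in product(layer_a, layer_b):
        geom = rect_intersection(fa["geom"], fb["geom"])
        if geom is not None:
            # Attribute join: the output row combines both input rows.
            result.append({"geom": geom, "attrs": {**fa["attrs"], **fb["attrs"]}})
    return result


soil = [
    {"geom": (0, 0, 5, 10), "attrs": {"soil": "clay"}},
    {"geom": (5, 0, 10, 10), "attrs": {"soil": "sand"}},
]
land_use = [
    {"geom": (0, 0, 10, 4), "attrs": {"use": "farm"}},
    {"geom": (0, 4, 10, 10), "attrs": {"use": "forest"}},
]

pieces = overlay_intersection(soil, land_use)
total_area = sum((g[2] - g[0]) * (g[3] - g[1]) for g in (p["geom"] for p in pieces))
```

Testing every pair of features is the simplest approach; it also hints at why vector overlay is computationally demanding for large layers.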
Neighbourhood functions
The principle here is to find out the characteristics of the vicinity, here called neighbourhood, of a
location. After all, many suitability questions, for instance, depend not only on what is at the location,
but also on what is near the location. To perform neighbourhood analysis, we must:
1. State which target locations are of interest to us, and define their spatial extent,
2. Define how to determine the neighbourhood for each target,
3. Define which characteristic(s) must be computed for each neighbourhood.
Neighbourhood functions
The basic functions that fall under this domain are:
1. Average
2. Diversity
3. Minimum/Maximum and
4. Total
The parameters that need to be defined to operate these functions are:
1. Target locations
2. Specification of neighbourhood
3. Function to be performed on neighbourhood elements
4. Search is one of the most common neighbourhood functions
5. On a vector model, a neighbourhood function is a specialised search function, while on a
raster model polygons are held on one layer and points and lines on a separate layer.
6. Thiessen polygon operation
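The basic neighbourhood functions listed above can be sketched for a raster as follows. The assumptions here are a square window of a given radius around each target cell (the target itself included), clipping at the raster edge, and an invented land cover grid.

```python
def neighbourhood(grid, row, col, radius=1):
    """Cell values within `radius` cells of (row, col), clipped at the raster edge."""
    rows, cols = len(grid), len(grid[0])
    return [
        grid[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]


# The characteristic to compute for each neighbourhood.
FUNCTIONS = {
    "average": lambda v: sum(v) / len(v),
    "diversity": lambda v: len(set(v)),   # number of distinct values
    "minimum": min,
    "maximum": max,
    "total": sum,
}


def focal(grid, targets, function, radius=1):
    """Apply one neighbourhood function around each target location."""
    compute = FUNCTIONS[function]
    return {t: compute(neighbourhood(grid, t[0], t[1], radius)) for t in targets}


land_cover = [
    [1, 1, 2],
    [1, 3, 2],
    [4, 3, 2],
]

centre_diversity = focal(land_cover, [(1, 1)], "diversity")
centre_total = focal(land_cover, [(1, 1)], "total")
corner_average = focal(land_cover, [(0, 0)], "average")
```

The three parameters from the list map directly onto the arguments: `targets` are the target locations, `radius` specifies the neighbourhood, and `function` selects what is computed.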
UNIT 5
Bertin's six categories of visual variables
1. Size
2. Value (lightness)
3. Texture
4. Colour
5. Orientation
6. Shape
MAP COSMETICS
1. Each map should have, next to the map image, a title, informing the user about the topic
visualized. A legend is necessary to understand how the topic is depicted.
2. Additional marginal information to be found on a map is a scale indicator, a north arrow
for orientation, the map datum and map projection used and some lineage
information (such as data sources, dates of data collection, methods used, etc.).
3. Further information can be added that indicates when the map was issued and by whom
(author/ publisher). All this information allows the user to obtain an impression of the
quality of the map and is comparable with metadata describing the contents of a
database or data layer.
4. On paper maps, these elements have to appear next to the map itself.
5. Maps presented on screen often go without marginal information, partly because of
space constraints. However, on-screen maps are often interactive and clicking on a map
element may reveal additional information from the database. Legends and titles are
often available on demand as well.
6. Text is used to transfer information in addition to the symbols used. This can be done by
the application of the visual variables to the text as well.
7. A common example is the use of colour to differentiate between hydrographic names (in
blue) and other names (in black). The text should also be placed in a proper position with
respect to the object to which it refers.
8. The design aspect of creating appealing maps also has to be included in the visualization
process. ‘Appealing’ does not only mean having nice colours. One of the keywords here
is ‘contrast’.
9. Contrast will increase the communicative role of the map since it creates a hierarchy in
the map contents, assuming that not all information has equal importance. This design
trick is known as visual hierarchy or the figure ground concept.
MAP DISSEMINATION
1. The map design will not only be influenced by the nature of the data to be mapped or
the intended audience (the ‘what’ and ‘whom’ from “how do I say what to whom, and is it
effective”); the output medium also plays a role. Traditionally, maps were produced on
paper and many still are.
2. Compared to maps on paper, on-screen maps have to be smaller and therefore their
contents should be carefully selected. This might seem a disadvantage, but presenting
maps on-screen offers very interesting alternatives.
3. A mouse click could also open the link to a database and reveal much more information
than a paper map could ever offer. Links to other than tabular or map data could also be
made available.
4. Maps and multimedia (photography, sound, video, animation) can be integrated. Some of
today’s electronic atlases such as the Encarta world atlas are good examples of how
multimedia elements can be integrated with the map.
5. Pointing to a country on a world map might start the national anthem of the country or show
its flag. The map can also be used to explore a country’s language: moving the mouse would
start a short sentence in the region’s dialect.
6. The WWW is nowadays a common medium used to present and disseminate spatial
data. Here maps can play their traditional role, for instance to show the location of
objects or provide insight into spatial patterns, but because of the nature of the
internet, the map can also function as an interface to additional information.
7. Maps can also be used as ‘previews’ of spatial data products to be acquired through a
spatial data clearing house that is part of a spatial data infrastructure. For that purpose
we can make use of geo-webservices, which can provide interactive map views as an
intermediary between the data and the web browser.
8. The WWW also allows for the fully interactive presentation of 3D models. The Virtual
Reality Modeling Language (VRML), for instance, can be used for this purpose. It stores a
true 3D model of the objects, not just a series of 3D views.