
CH 4-Querying Gis Data

The document discusses different types of queries that can be performed in a GIS: attribute queries, location/spatial queries, and selections. Attribute queries search attribute tables for specific field values, spatial queries examine the locations and relationships between features, and selections simply highlight features of interest. The document provides examples of each type of query and outlines the basic structure of an attribute query using SQL clauses like SELECT, FROM, and WHERE.

Querying GIS Data

Introduction
One of the defining characteristics of GIS is the ability to ask
questions (and get answers) in a spatial context. These questions, or
queries, are requests for information that are posed in a specific fashion.
When a query is performed to obtain an answer to a spatial question, the
data are accessed through the basic elements of the data structure, which
is made up of tables, fields, data sets, values and connections.

Information about an object's attributes can be obtained by selecting
its geometric representation in a GIS. Conversely, an object's geometric
representation can be identified by selecting its entry in the attribute
table.

Queries are performed in GIS software through the query functions of
its user interface. The selective display and retrieval of information
based on these queries are essential components of any geographic
information system (GIS).

Generally, there are two types of GIS queries: attribute and location
(spatial), but there are three basic methods for searching and querying
attribute data: (1) selection, (2) query by attribute (or select by Attribute),
and (3) query by geography or select by location (spatial query).

GIS is powerful because it can use both attribute and spatial queries
to get answers with far less effort or answer questions that would not be
practical to answer using any other method.

Selection
Selection represents the easiest way to search and query spatial data
in a GIS. Selecting features highlights the features of interest, both
on-screen and in the attribute table, for subsequent display or analysis. To
accomplish this, one selects points, lines, and polygons simply by using the
cursor to “point-and-click” the feature of interest or by using the cursor to
drag a box around those features. Alternatively, one can select features by
using a graphic object, such as a circle, line, or polygon, to highlight all of
those features that fall within the object. Advanced options for selecting
subsets of data from the larger dataset include creating a new selection,
selecting from the currently selected features, adding to the current
selection, and removing from the current selection.
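
The four selection options listed above behave like set operations. The following is a minimal sketch in Python, assuming that a selection can be modeled as a set of feature IDs; the IDs and variable names are hypothetical and are not part of any GIS software.

```python
# Hypothetical feature IDs; each selection is modeled as a set of IDs.
current = {1, 2, 3, 4}   # features already selected
picked = {3, 4, 5}       # features captured by the new click or box

new_selection = set(picked)      # create a new selection
from_current = current & picked  # select from the currently selected features
added = current | picked         # add to the current selection
removed = current - picked       # remove from the current selection

print(sorted(from_current), sorted(added), sorted(removed))
```
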

Query by Attribute
Map features and their associated data can be retrieved via the
query of attribute information within the data tables. For example, search
and query tools allow a user to show all the census tracts that have a
population density of 500 or greater, to show all counties that are less than
or equal to 100 square kilometers, or to show all convenience stores within
1 mile of an interstate highway.

Attribute queries ask for information from the tables associated with
features or from stand-alone tables associated with the GIS. Attributes can
be numeric values, text strings, Boolean values (i.e., true or false), or dates.
This kind of query is similar to a query made to any database; however,
when using a GIS, the answers (i.e., the features related to the records
selected by the process) are highlighted on the map as well as in the table.

Specifically, SQL (Structured Query Language) is a commonly used
computer language developed to query attribute data within a relational
database management system (RDBMS). It was created by IBM in the 1970s.
This language is composed of a set of commands and rules that are used to
ask questions of a database. SQL allows for the retrieval of a subset of
attribute information based on specific, user-defined criteria via the
implementation of particular language elements. More recently, the use of
SQL has been extended to GIS software, such as ArcGIS Desktop.

The software detects the characteristics of the underlying database
being queried and modifies the interface so SQL statements will be created
using the appropriate search terms and information. The user is not
required to understand the SQL format for the specific database being
queried.

Query by Geography (Select by location)

Query by geography, also known as a “spatial query,” allows one to
highlight particular features by examining their position relative to other
features. The answers to spatial queries are derived directly from the
location of features on a map. Information about the proximity of one
parcel to other parcels or other kinds of features, such as roads, is not
contained in an attribute table but is easily learned using a spatial query.

For example, you could ask if one or more features are located within
a certain distance of other features, are contained by another feature,
intersect other features, or possess another of the relationships defined by
spatial operators. A GIS provides robust tools for determining, for
instance, the number of schools within 10 miles of a home. Several spatial
query options are available, as outlined here. Throughout this discussion,
the “target layer” refers to the feature dataset whose features are
selected (the schools), while the “source layer” refers to the feature
dataset on which the spatial query is based (the home). Similarly, if we
were to use a state boundary polygon feature dataset to select highways
from a line feature dataset (e.g., select all the highways that run through
the state of Arkansas), the state layer is the source, while the highway
layer is the target.

Building a Basic Query


All basic queries have three parts: a source, a filter, and a
relationship. This is true of both attribute and spatial (location) queries.
The source can be a table or feature class. The filter can be an attribute
value or a shape or feature. The relationship between the source and the
filter is based on logical, comparison, or spatial operators. When creating a
query, identify which table or feature class will contain the information that
answers your question. If the question is "What parcels have a specific
commercial zoning?", an attribute table for parcel features that contains a
field for zoning would likely be the source. The filter identifies what is
different about the desired items, whether those items are table records or
features. In the commercial zoning example, that characteristic would be
the code C-3, which identifies the commercial zone of interest.

The relationship between the source and filter does the work of
finding the records or features. Continuing with the example, the zoning
field must contain the value C-3, so the relationship would be “equals”
or =. Relationships are defined using operators. Comparison and logical
operators are applied to attribute queries. Comparison operators include
=, <>, >, >=, <, and <=. LIKE, AND, OR, and NOT are logical operators. In
contrast to the relatively short list of operators for attribute queries,
there are more than a dozen types of spatial operators. Intersect, Are
Within a Distance Of, Contain, and Are Contained By are some of the most
common and useful ones. Spatial queries performed using Select by
Location deal with vector data; they use a shape as the filter and test its
relationship with features in the source layer to answer a question.
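
The three parts of a basic query can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parcel records, the field names, and the `attribute_query` helper are hypothetical, and the comparison operator is passed in as a function from the standard `operator` module.

```python
import operator

# Hypothetical parcel records standing in for an attribute table (the source).
parcels = [
    {"parcel_id": 101, "zoning": "C-3"},
    {"parcel_id": 102, "zoning": "R-1"},
    {"parcel_id": 103, "zoning": "C-3"},
]

def attribute_query(source, field, relationship, filter_value):
    """Return the records whose field stands in the given relationship to the filter."""
    return [rec for rec in source if relationship(rec[field], filter_value)]

# Source: parcels; filter: the code C-3; relationship: equals (=).
commercial = attribute_query(parcels, "zoning", operator.eq, "C-3")
print([rec["parcel_id"] for rec in commercial])
```

Swapping `operator.eq` for `operator.ne`, `operator.gt`, and so on changes only the relationship, while the source and filter stay the same.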

Structure of Query by Attribute

All attribute tables in a relational database management system
(RDBMS) used for an SQL query must contain primary and/or foreign keys
for proper use. In addition to these keys, SQL implements clauses to
structure database queries. Clauses are language elements that include
the SELECT, FROM, WHERE, ORDER BY, and HAVING query statements.

• SELECT denotes what attribute table fields you wish to view.
• FROM denotes the attribute table in which the information resides.
• WHERE denotes the user-defined criteria for the attribute
information that must be met in order for it to be included in the
output set.
• ORDER BY denotes the sequence in which the output set will be
displayed.
• HAVING denotes the predicate used to filter grouped output, that is,
the groups of records produced by a GROUP BY clause.

While SELECT and FROM are both mandatory in an SQL query, WHERE is
an optional clause used to limit the output set. ORDER BY and HAVING are
also optional and are used to present the information in an interpretable
manner.
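
The clauses can be tried out with Python's built-in `sqlite3` module. The following is a minimal sketch, not the GIS software's own query engine: the three rows are hypothetical stand-ins for the address table, and the GROUP BY clause is an addition beyond the text, included because in standard SQL HAVING is applied to the groups that GROUP BY forms.

```python
import sqlite3

# In-memory database with hypothetical rows (not the chapter's actual table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ExampleTable (LastName TEXT, FirstName TEXT, City TEXT, State TEXT)"
)
conn.executemany(
    "INSERT INTO ExampleTable VALUES (?, ?, ?, ?)",
    [
        ("Smith", "Zoe", "Upland", "CA"),
        ("Jones", "Adam", "Upland", "CA"),
        ("Lee", "Mia", "Reno", "NV"),
    ],
)

# SELECT and FROM are required; WHERE limits the rows; ORDER BY sorts them.
names = conn.execute(
    "SELECT LastName FROM ExampleTable WHERE State = 'CA' ORDER BY FirstName"
).fetchall()

# HAVING filters the groups produced by GROUP BY (cities with more than one row).
busy_cities = conn.execute(
    "SELECT City, COUNT(*) FROM ExampleTable GROUP BY City HAVING COUNT(*) > 1"
).fetchall()

print(names, busy_cities)
```
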

Personal Addresses in “ExampleTable” Attribute Table

The following is a series of SQL expressions and results when applied
to Figure "Personal Addresses in “ExampleTable” Attribute Table". The title
of the attribute table is “ExampleTable.” Note that the asterisk (*) denotes
a special case of SELECT whereby all columns for a given record are
selected:

SELECT * FROM ExampleTable WHERE City = 'Upland'


This statement returns the following:

Consider the following statement:

SELECT LastName FROM ExampleTable WHERE State = 'CA' ORDER BY FirstName

This statement results in the following table sorted in ascending order by
the FirstName column (not included in the output table as directed by the
SELECT clause):

In addition to clauses, SQL allows for the inclusion of specific
operators to further delimit the result of a query. These operators can be
relational, arithmetic, or Boolean and will typically appear inside
conditional statements in the WHERE clause. A relational operator tests
whether values are equal to (=), not equal to (<>), less than (<), less
than or equal to (<=), greater than (>), or greater than or equal to (>=)
one another. Arithmetic operators are those mathematical functions that
include addition (+), subtraction (−), multiplication (*), and division (/).
Boolean operators (also called Boolean connectors) include the statements
AND, OR, XOR (exclusive OR), and NOT. The AND connector is used to select
records from the attribute table that satisfy both expressions. The OR
connector selects records that satisfy either one or both expressions. The
XOR connector selects records that satisfy one and only one of the
expressions. Lastly, the NOT connector is used to negate (or unselect) an
expression that would otherwise be true. Put into the language of set
theory, the AND connector represents an intersection, OR represents a
union, and NOT represents a complement. Figure "Venn Diagram of SQL
Operators" illustrates the logic of these connectors, where circles A and B
represent two sets of intersecting data. Keep in mind that SQL is a very
exacting language and minor inconsistencies in the statement, such as
additional spaces, can result in a failed query.

Venn Diagram of SQL Operators
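
The logic of the connectors can also be expressed with Python sets. This is a small sketch, assuming that `a` and `b` hold the IDs of the records that satisfy two expressions A and B; the IDs themselves are hypothetical.

```python
# Hypothetical record IDs: all records, and those satisfying expressions A and B.
all_records = {1, 2, 3, 4, 5, 6}
a = {1, 2, 3}
b = {3, 4}

a_and_b = a & b          # AND: records in both sets (intersection)
a_or_b = a | b           # OR: records in either or both sets (union)
a_xor_b = a ^ b          # XOR: records in exactly one of the sets
not_a = all_records - a  # NOT: records outside set A (complement)

print(sorted(a_and_b), sorted(a_or_b), sorted(a_xor_b), sorted(not_a))
```
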


Used together, these operators combine to provide the GIS user with
powerful and flexible search and query options. With this in mind, can you
determine the output set of the following SQL query as it is applied
to Figure "Personal Addresses in “ExampleTable” Attribute Table"?

Consider first how AND behaves. When building complex expressions, the
AND logical operator is used to combine two (or more) simple expressions
to find features for which all of the expressions are true. If an
expression is built to find State Name = 'California' AND State Name =
'Colorado', the query will look for a feature named both California and
Colorado, and it will return nothing because no single feature can carry
both names. AND is properly used to build complex expressions such as State
Population greater than or equal to 65000 AND State Area less than 125000,
which will return only states that have a population greater than or equal
to 65000 and a total area less than 125000 square miles.
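
The behavior of AND described above can be checked with Python's built-in `sqlite3` module. In this sketch the `States` table, its column names, and the rounded population and area figures are illustrative assumptions only, not data from the chapter.

```python
import sqlite3

# In-memory table with illustrative, rounded values (assumptions, not source data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE States (Name TEXT, Population INTEGER, Area REAL)")
conn.executemany(
    "INSERT INTO States VALUES (?, ?, ?)",
    [
        ("California", 39000000, 163695.0),
        ("Colorado", 5800000, 104094.0),
        ("Vermont", 640000, 9616.0),
    ],
)

def names(where_clause):
    """Run a WHERE clause against the States table and return the matching names."""
    sql = "SELECT Name FROM States WHERE " + where_clause + " ORDER BY Name"
    return [row[0] for row in conn.execute(sql)]

# No single row can carry two different names, so AND returns nothing.
both_names = names("Name = 'California' AND Name = 'Colorado'")
# OR is the connector that selects rows matching either name.
either_name = names("Name = 'California' OR Name = 'Colorado'")
# AND across two different fields narrows the result as intended.
pop_and_area = names("Population >= 65000 AND Area < 125000")

print(both_names, either_name, pop_and_area)
```

The same reasoning applies to any pair of conditions on one field that cannot both hold for the same record.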

SELECT LastName, FirstName, StreetNumber FROM ExampleTable WHERE
StreetNumber >= 10000 AND StreetNumber < 100 ORDER BY LastName

The following are the results:


Structure of Query by Geography (Select by location)

There are many types of spatial query operators (Select by Location),
including INTERSECT, ARE WITHIN A DISTANCE OF, COMPLETELY CONTAIN, ARE
COMPLETELY WITHIN, HAVE THEIR CENTER IN, SHARE A LINE SEGMENT, TOUCH THE
BOUNDARY OF, ARE IDENTICAL TO, ARE CROSSED BY THE OUTLINE OF, CONTAIN,
and ARE CONTAINED BY.

INTERSECT

This spatial query technique selects all features in the target layer
that share a common locale with the source layer. The “intersect” query
allows points, lines, or polygon layers to be used as both the source and
target layers (Figure below).

The highlighted blue and yellow features are selected because they intersect the red features.

ARE WITHIN A DISTANCE OF.

This technique requires the user to specify some distance value,
which is then used to buffer the source layer (see "Geospatial Analysis:
Vector Operations" and "Multiple Layer Analysis"). All features that
intersect this buffer are highlighted in the target layer. The “are within
a distance of” query allows points, lines, or polygon layers to be used for
both the source and target layers (Figure below).

The highlighted blue and yellow features are selected because they are within the selected
distance of the red features; tan areas represent buffers around the various features.

Source: Red Target: Blue and yellow (selected)
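
For point features, the “are within a distance of” test reduces to comparing straight-line distances against the chosen value. The following is a rough sketch in Python, assuming planar coordinates measured in miles; the school names, coordinates, and the `within_distance` helper are hypothetical, and real GIS software also handles lines, polygons, and map projections.

```python
import math

def within_distance(target_points, source_points, distance):
    """Return names of target points lying within `distance` of any source point."""
    selected = []
    for name, (tx, ty) in target_points.items():
        if any(math.hypot(tx - sx, ty - sy) <= distance for sx, sy in source_points):
            selected.append(name)
    return selected

# Target layer: schools; source layer: a single home at the origin.
schools = {"North": (3.0, 4.0), "South": (0.0, -12.0), "East": (9.0, 0.0)}
home = [(0.0, 0.0)]

nearby = within_distance(schools, home, 10.0)
print(nearby)
```
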


COMPLETELY CONTAIN.

This spatial query technique selects those features in the target
layer that completely contain features of the source layer, that is, the
source features lie entirely within them. Features with coincident
boundaries are not selected by this query type. The “completely contain”
query allows for points, lines, or polygons as the source layer, but only
polygons can be used as a target layer (Figure below).

The highlighted blue and yellow features are selected because they completely contain the red
features.

ARE COMPLETELY WITHIN.

This query selects those features in the target layer whose entire
spatial extent occurs within the geometry of the source layer. The “are
completely within” query allows for points, lines, or polygons as the target
layer, but only polygons can be used as a source layer (Figure below).

The highlighted blue and yellow features are selected because they are completely within the red
features.

HAVE THEIR CENTER IN.

This technique selects target features whose center, or centroid, is
located within the boundary of the source feature dataset. The “have their
center in” query allows points, lines, or polygon layers to be used as both
the source and target layers (Figure below).

The highlighted blue and yellow features are selected because they have their centers in the red
features.
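
The “have their center in” test can be sketched for simple polygons in plain Python. This is an illustration under stated assumptions rather than how any particular GIS implements it: polygons are lists of (x, y) vertices, the centroid is the area-weighted centroid, containment uses a standard ray-casting test, and all names and coordinates are hypothetical.

```python
def centroid(polygon):
    """Area-weighted centroid of a simple polygon given as a list of (x, y) vertices."""
    area2 = cx = cy = 0.0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross                 # accumulates twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)

def point_in_polygon(point, polygon):
    """Ray-casting test: count how many edges a ray to the right of the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        if (y0 > y) != (y1 > y):       # edge straddles the horizontal line through y
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

source = [(0, 0), (10, 0), (10, 10), (0, 10)]   # source polygon
target_a = [(8, 4), (14, 4), (14, 6), (8, 6)]   # overlaps, but its center lies outside
target_b = [(6, 1), (11, 1), (11, 3), (6, 3)]   # overlaps, and its center lies inside

selected = [name for name, poly in (("a", target_a), ("b", target_b))
            if point_in_polygon(centroid(poly), source)]
print(selected)
```

Both hypothetical targets intersect the source, yet only the one whose centroid falls inside it is selected, which is what separates this operator from INTERSECT.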
SHARE A LINE SEGMENT.

This spatial query selects target features whose boundary geometries
share a minimum of two adjacent vertices with the source layer. The “share
a line segment” query allows for line or polygon layers to be used for
either of the source and target layers (Figure below).

The highlighted blue and yellow features are selected because they share a line segment with the
red features.
TOUCH THE BOUNDARY OF.

This methodology is similar to the INTERSECT spatial query; however,
it selects line and polygon features that share a common boundary with
the source layer. The “touch the boundary of” query allows for line or
polygon layers to be used as both the source and target layers (Figure
below).

The highlighted blue and yellow features are selected because they touch the boundary of the red
features.
ARE IDENTICAL TO.

This spatial query returns features that have the exact same
geographic location. The “are identical to” query can be used on points,
lines, or polygons, but the target layer type must be the same as the source
layer type (Figure below).

The highlighted blue and yellow features are selected because they are identical to the red
features.

ARE CROSSED BY THE OUTLINE OF.

This selection criterion returns features that share a single vertex
but not an entire line segment. The “are crossed by the outline of” query
allows for line or polygon layers to be used as both source and target
layers (Figure below).

The highlighted blue and yellow features are selected because they are crossed by the outline of
the red features.


CONTAIN.

This method is similar to the COMPLETELY CONTAIN spatial query;
however, features in the target layer will be selected even if the boundaries
overlap. The “contain” query allows point, line, or polygon target layers
when points are used as the source; line and polygon target layers with a
line source; and only polygon target layers with a polygon source (Figure
below).

The highlighted blue and yellow features are selected because they contain the red features.
ARE CONTAINED BY.

This method is similar to the ARE COMPLETELY WITHIN spatial query;
however, features in the target layer will be selected even if the boundaries
overlap. The “are contained by” query allows point, line, or polygon target
layers when polygons are used as the source; point and line target layers
with a line source; and only point target layers with a point source
(Figure below).

The highlighted blue and yellow features are selected because they are contained by the red
features.
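
The difference between ARE CONTAINED BY and ARE COMPLETELY WITHIN comes down to whether shared boundaries are allowed. The sketch below shows the distinction in Python under a strong simplifying assumption: every feature is an axis-aligned rectangle written as (xmin, ymin, xmax, ymax). The function names and coordinates are hypothetical.

```python
def are_contained_by(target, source):
    """True if the target box lies inside the source box; shared boundaries are allowed."""
    return (source[0] <= target[0] and source[1] <= target[1]
            and target[2] <= source[2] and target[3] <= source[3])

def are_completely_within(target, source):
    """True only if the target box lies inside the source box without touching its boundary."""
    return (source[0] < target[0] and source[1] < target[1]
            and target[2] < source[2] and target[3] < source[3])

source = (0, 0, 10, 10)
targets = {
    "inner": (2, 2, 5, 5),       # strictly inside the source
    "edge": (0, 2, 5, 5),        # inside, but shares the source's left boundary
    "outside": (8, 8, 12, 12),   # extends beyond the source
}

contained = [n for n, box in targets.items() if are_contained_by(box, source)]
completely_within = [n for n, box in targets.items() if are_completely_within(box, source)]
print(contained, completely_within)
```

Swapping the roles of the two arguments gives the matching pair for CONTAIN and COMPLETELY CONTAIN.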
Using Operators

When selecting by attribute, LIKE and NOT are often used to compare
text strings and match patterns, often employing wildcards. A wildcard
(also called a wild character or wildcard character) is a symbol used to
replace or represent one or more unknown characters. The character used
depends on the data source being queried. For personal geodatabases, use
a question mark (?) for a single character and an asterisk (*) for a group
of characters. For shapefiles, ArcSDE geodatabase feature classes, and
other types of data, use an underscore (_) to replace a single unknown
character and a percentage sign (%) to replace a group of characters.
ArcMap will detect the type of database being queried and adjust the
wildcard characters available through the Select by Attribute dialog box.

LIKE is a good operator for finding text strings that contain variant
spellings, or possibly misspellings, of a text string. Rather than using
an equal sign followed by the search term enclosed in quotes, use LIKE and
enclose the search term and a wildcard character in quotes.

Like the characters used for wildcards, the syntax used when querying
dates depends on the underlying database. ArcMap automatically writes the
proper syntax when you double-click on a date value in the Unique Values
list of the Select by Attribute dialog box.

The choice of spatial operator (i.e., the relationship tested by the
query) depends on the types of features that will be used for the source
and filter.
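
The two wildcard styles describe the same kind of pattern. The sketch below uses Python's standard `fnmatch` module, which understands the `*` and `?` style, together with a hypothetical `like_to_glob` helper that translates the `%` and `_` style. It is an illustration only: it ignores escape characters and case-sensitivity rules, which vary between databases, and in `fnmatch` the asterisk also matches an empty run of characters.

```python
import fnmatch

def like_to_glob(pattern):
    """Translate %/_ wildcards into the */? style (no escape handling)."""
    return pattern.replace("%", "*").replace("_", "?")

# Hypothetical attribute values containing variant spellings.
names = ["Smith", "Smyth", "Smithson", "Jones"]

# One unknown character: matches the variant spellings Smith and Smyth.
variants = [n for n in names if fnmatch.fnmatchcase(n, like_to_glob("Sm_th"))]
# A group of unknown characters: matches every name beginning with "Sm".
prefixed = [n for n in names if fnmatch.fnmatchcase(n, like_to_glob("Sm%"))]

print(variants, prefixed)
```
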

Answering More Complex Questions

To meet multiple search criteria, several attribute queries can be
combined using logical operators (such as AND, OR, LIKE, and NOT) to find
records based on several criteria in two or more attribute fields. Remember
that OR is the far more flexible and inclusive operator. When using AND,
both conditions must be true to return records. For queries that use OR,
only one condition must be true to return records. Operators of equal rank
are ordinarily evaluated from left to right (in standard SQL, AND is
applied before OR); however, any portion of the query enclosed in
parentheses is evaluated first. The order of operations can be important
both in obtaining a valid answer to the question being asked and in
optimizing the way a query runs. A single query such as "LOT_SIZE" >= 1
AND "LOT_SIZE" <= 2 AND "SLOPE" < 5 could be used to locate parcels with
lots between one and two acres in size that have a slope of less than 5
percent. For more complex spatial queries, subqueries can be used to sieve
data. The results of one query can be used as the basis for additional
queries related to the currently selected features, either selecting from
those features, adding to those features, or removing features from the
selected set. These operations work in much the same way as AND and OR
operators by creating subsets that will be the basis for additional selections.
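
The effect of combining conditions, and of parentheses, can be seen with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the `Parcels` table, the `ZONING` field, and the four rows are hypothetical, and SQLite follows standard SQL in applying AND before OR when no parentheses are given.

```python
import sqlite3

# In-memory table of hypothetical parcels.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Parcels (PARCEL_ID INTEGER, LOT_SIZE REAL, SLOPE REAL, ZONING TEXT)"
)
conn.executemany(
    "INSERT INTO Parcels VALUES (?, ?, ?, ?)",
    [
        (1, 1.5, 3.0, "C-3"),
        (2, 2.5, 2.0, "C-3"),
        (3, 1.2, 8.0, "C-2"),
        (4, 1.8, 4.0, "R-1"),
    ],
)

def parcel_ids(where_clause):
    """Return the IDs of the parcels that satisfy the given WHERE clause."""
    sql = "SELECT PARCEL_ID FROM Parcels WHERE " + where_clause + " ORDER BY PARCEL_ID"
    return [row[0] for row in conn.execute(sql)]

# The lot size and slope query from the text.
size_and_slope = parcel_ids('"LOT_SIZE" >= 1 AND "LOT_SIZE" <= 2 AND "SLOPE" < 5')

# Without parentheses, AND binds first: C-2, or (C-3 with a gentle slope).
ungrouped = parcel_ids("ZONING = 'C-2' OR ZONING = 'C-3' AND SLOPE < 5")
# With parentheses, the zoning test is evaluated first, then the slope test.
grouped = parcel_ids("(ZONING = 'C-2' OR ZONING = 'C-3') AND SLOPE < 5")

print(size_and_slope, ungrouped, grouped)
```

Parcel 3 appears only in the ungrouped result, which shows why explicit parentheses are the safer way to state the intended grouping.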

Working with Both Query Types

Although attribute and spatial queries can work together in locating
the desired information, they are entered in ArcMap in different dialog
boxes (Select by Attribute, Select by Location), so these two kinds of
queries must be posed separately. To expand the previous example (and
narrow the search), the desired parcels must have sold within the last
year, be zoned C-3, and be located within five miles of a specific parcel,
in addition to being between one and two acres in size with a slope of
less than 5 percent. The date sold and zoning information are also stored
in the attribute table, but identifying the parcels within five miles of
the subject parcel requires a spatial query. Limiting the parcel search to
those that fall into the five-mile buffer around the subject parcel will
eliminate many records, so that query is performed first. An attribute
query is then run against the features selected by location: it first
evaluates the sale date to limit the selection and then the zoning, lot
size, and slope, identifying the features that fulfill all criteria.
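
The two-step workflow described above can be sketched in Python. This is an illustration under stated assumptions, not the ArcMap dialog boxes themselves: the parcels, their planar coordinates in miles, the reference date, and the field names are all hypothetical.

```python
import math
from datetime import date, timedelta

today = date(2024, 6, 1)   # hypothetical reference date
subject = (0.0, 0.0)       # location of the subject parcel

parcels = [
    {"id": "A", "xy": (3.0, 4.0), "sold": date(2024, 1, 15), "zoning": "C-3", "lot_size": 1.5, "slope": 3.0},
    {"id": "B", "xy": (6.0, 8.0), "sold": date(2024, 2, 1), "zoning": "C-3", "lot_size": 1.5, "slope": 3.0},
    {"id": "C", "xy": (1.0, 1.0), "sold": date(2021, 5, 1), "zoning": "C-3", "lot_size": 1.2, "slope": 2.0},
    {"id": "D", "xy": (2.0, 0.0), "sold": date(2023, 12, 1), "zoning": "R-1", "lot_size": 1.8, "slope": 1.0},
]

# Step 1 (spatial query): keep the parcels within five miles of the subject parcel.
by_location = [
    p for p in parcels
    if math.hypot(p["xy"][0] - subject[0], p["xy"][1] - subject[1]) <= 5.0
]

# Step 2 (attribute query): filter the spatial selection by sale date, zoning,
# lot size, and slope.
one_year_ago = today - timedelta(days=365)
matches = [
    p["id"] for p in by_location
    if p["sold"] >= one_year_ago
    and p["zoning"] == "C-3"
    and 1 <= p["lot_size"] <= 2
    and p["slope"] < 5
]

print([p["id"] for p in by_location], matches)
```

Running the spatial step first means the attribute test only has to examine the parcels that survived the distance filter.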
