Representing Geography: Geographic Information Systems and Science, 2nd Edition


3 Representing geography

This chapter introduces the concept of representation, or the construction of
a digital model of some aspect of the Earth’s surface. Representations have
many uses, because they allow us to learn, think, and reason about places
and times that are outside our immediate experience. This is the basis of
scientific research, planning, and many forms of day-to-day problem solving.
The geographic world is extremely complex, revealing more detail the
closer one looks, almost ad infinitum. So in order to build a representation
of any part of it, it is necessary to make choices, about what to represent,
at what level of detail, and over what time period. The large number of
possible choices creates many opportunities for designers of GIS software.
Generalization methods are used to remove detail that is unnecessary for an
application, in order to reduce data volume and speed up operations.

Geographic Information Systems and Science, 2nd edition. Paul Longley, Michael Goodchild, David Maguire, and David Rhind.
© 2005 John Wiley & Sons, Ltd. ISBNs: 0-470-87000-1 (HB); 0-470-87001-X (PB)
64 PART II PRINCIPLES

Learning Objectives

After reading this chapter you will know:

■ The importance of understanding representation in GIS;

■ The concepts of fields and objects and their fundamental significance;

■ Raster and vector representation and how they affect many GIS principles,
techniques, and applications;

■ The paper map and its role as a GIS product and data source;

■ The importance of generalization methods and the concept of
representational scale;

■ The art and science of representing real-world phenomena in GIS.

3.1 Introduction

We live on the surface of the Earth, and spend most of our lives in a
relatively small fraction of that space. Of the approximately 500 million
square kilometers of surface, only one third is land, and only a fraction of
that is occupied by the cities and towns in which most of us live. The rest of
the Earth, including the parts we never visit, the atmosphere, and the solid
ground under our feet, remains unknown to us except through the information
that is communicated to us through books, newspapers, television, the Web, or
the spoken word. We live lives that are almost infinitesimal in comparison
with the 4.5 billion years of Earth history, or the over 10 billion years
since the universe began, and know about the Earth before we were born only
through the evidence compiled by geologists, archaeologists, historians, etc.
Similarly, we know nothing about the world that is to come, where we have only
predictions to guide us.

Because we can observe so little of the Earth directly, we rely on a host of
methods for learning about its other parts, for deciding where to go as
tourists or shoppers, choosing where to live, running the operations of
corporations, agencies, and governments, and many other activities. Almost all
human activities at some time require knowledge (Section 1.2) about parts of
the Earth that are outside our direct experience, because they occur either
elsewhere in space, or elsewhere in time. Sometimes this knowledge is used as
a substitute for directly sensed information, creating a virtual reality (see
Section 11.3.1). Increasingly it is used to augment what we can see, touch,
hear, feel, and smell, through the use of mobile information systems that can
be carried around.

Our knowledge of the Earth is not created entirely freely, but must fit with
the mental concepts we began to develop as young children – concepts such as
containment (Paris is in France) or proximity (Dallas and Fort Worth are
close). In digital representations, we formalize these concepts through data
models (Chapter 8), the structures and rules that are programmed into a GIS to
accommodate data. These concepts and data models together constitute our
ontologies, the frameworks that we use for acquiring knowledge of the world.

Almost all human activities require knowledge
about the Earth – past, present, or future.

One such ontology, a way to structure knowledge of movement through time, is a
three-dimensional diagram, in which the two horizontal axes denote location on
the Earth’s surface, and the vertical axis denotes time. In Figure 3.1, the
daily lives of a sample of residents of Lexington, Kentucky, USA are shown as
they move by car through space and time, from one location to another, while
going about their daily business of shopping, traveling to work, or dropping
children at school. The diagram is crude, because each journey is represented
by a series of straight lines between locations measured with GPS, and if we
were able to examine each track or trajectory in more detail we would see the
effects of having to follow streets, stopping at traffic lights, or slowing
for congestion. If we looked even closer we might see details of each person’s
walk to and from the car. Each closer perspective would display more
information, and a vast storehouse would be required to capture the precise
trajectories of all humans throughout even a single day.

The real trajectories of the individuals shown in Figure 3.1 are complex, and
the figure is only a representation of them – a model on a piece of paper,
generated by a computer from a database. We use the terms representation and
model because they imply a simplified relationship between the contents of the
figure and the database, and the real-world trajectories of the individuals.
Such representations or models serve many useful purposes, and occur in many
different forms. For example, representations occur:

■ in the human mind, when our senses capture information about our
surroundings, such as the images captured by the eye, or the sounds captured
by the ear, and memory preserves such representations for future use;

■ in photographs, which are two-dimensional models of the light emitted or
reflected by objects in the world into the lens of a camera;

■ in spoken descriptions and written text, in which people describe some
aspect of the world in language, in the form of travel accounts or diaries; or
CHAPTER 3 REPRESENTING GEOGRAPHY 65

Figure 3.1 Schematic representation of the daily journeys of a sample of residents of Lexington, Kentucky, USA. The horizontal
dimensions represent geographic space and the vertical dimension represents time of day. Each person’s track plots as a
three-dimensional line, beginning at the base in the morning and ending at the top in the evening. (Reproduced with permission of
Mei-Po Kwan)

■ in the numbers that result when aspects of the world are measured, using
such devices as thermometers, rulers, or speedometers.

By building representations, we humans can assemble far more knowledge about
our planet than we ever could as individuals. We can build representations
that serve such purposes as planning, resource management and conservation,
travel, or the day-to-day operations of a parcel delivery service.

Representations help us assemble far more
knowledge about the Earth than is possible on
our own.

Representations are reinforced by the rules and laws that we humans have
learned to apply to the unobserved world around us. When we encounter a fallen
log in a forest we are willing to assert that it once stood upright, and once
grew from a small shoot, even though no one actually observed or reported
either of these stages. We predict the future occurrence of eclipses based on
the laws we have discovered about the motions of the Solar System. In GIS
applications, we often rely on methods of spatial interpolation to guess the
conditions that exist in places where no observations were made, based on the
rule (often elevated to the status of a First Law of Geography and attributed
to Waldo Tobler) that all places are similar, but nearby places are more
similar than distant places.

Tobler’s First Law of Geography: Everything is
related to everything else, but near things are
more related than distant things.

3.2 Digital representation

This book is about one particular form of representation that is becoming
increasingly important in our society – representation in digital form. Today,
almost all communication between people through such media as the telephone,
FAX, music, television, newspapers and magazines, or email is at some time in
its life in digital form. Information technology based on digital
representation is moving into all aspects of our lives, from science to
commerce to daily existence. Almost half of all households in some industrial
societies now own at least one powerful digital information processing device
(a computer); a large proportion of all work in offices now occurs using
digital computing technology; and digital technology has invaded many devices
that we use every day, from the microwave oven to the automobile.

One interesting characteristic of digital technology is that the
representation itself is rarely if ever seen by the user, because only a few
technical experts ever see the individual elements of a digital
representation. What we see instead are views, designed to present the
contents of the representation in a form that is meaningful to us.

The term digital derives from digits, or the fingers, and our system of
counting based on the ten digits of the human hand. But while the counting
system has ten symbols (0 through 9), the representation system in digital
computers uses only two (0 and 1). In a sense, then, the term digital is a
misnomer for a system that represents all information using some combination
of the two symbols 0 and 1, and the more exact term binary is more
appropriate. In this book we follow the convention
of using digital to refer to electronic technology based on binary
representations.

Computers represent phenomena as binary digits.
Every item of useful information about the Earth’s
surface is ultimately reduced by a GIS to some
combination of 0s and 1s.

Over the years many standards have been developed for converting information
into digital form. Box 3.1 shows the standards that are commonly used in GIS
to store data, whether they consist of whole or decimal numbers or text. There
are many competing coding standards for images and photographs (GIF, JPEG,
TIFF, etc.) and for movies (e.g., MPEG) and sound (e.g., MIDI, MP3). Much of
this book is about the coding systems used to represent geographic data,
especially Chapter 8, and as you might guess that turns out to be
comparatively complicated.

Digital technology is successful for many reasons, not the least of which is
that all kinds of information share a common basic format (0s and 1s), and can
be handled in ways that are largely independent of their actual meaning. The
Internet, for example, operates on the basis of packets of information,
consisting of strings of 0s and 1s, which are sent through the network based
on the information contained in the packet’s header. The network needs to know
only what the header means, and how to read the instructions it contains
regarding the packet’s destination. The rest of the contents are no more than
a collection of bits, representing anything from an email message to a short
burst of music or highly secret information on its way from one military
installation to another, and are almost never examined or interpreted during
transmission. This allows one digital communications network to serve every
need, from electronic commerce to chatrooms, and it allows manufacturers to
build processing and storage technology for vast numbers of users who have
very different applications in mind. Compare this to earlier ways of
communicating, which required printing presses and delivery trucks for one
application (newspapers) and networks of copper wires for another (telephone).

Digital representations of geography hold enormous advantages over previous
types – paper maps, written reports from explorers, or spoken accounts. We can
use

Technical Box 3.1

The binary counting system


The binary counting system uses only two symbols, 0 and 1, to represent
numerical information. A group of eight binary digits is known as a byte, and
volume of storage is normally measured in bytes rather than bits (Table 1.1).
There are only two options for a single digit, but there are four possible
combinations for two digits (00, 01, 10, and 11), eight possible combinations
for three digits (000, 001, 010, 011, 100, 101, 110, 111), and 256
combinations for a full byte. Digits in the binary system (known as binary
digits, or bits) behave like digits in the decimal system but using powers of
two. The rightmost digit denotes units, the next digit to the left denotes
twos, the next to the left denotes fours, etc. For example, the binary number
11001 denotes one unit, no twos, no fours, one eight, and one sixteen, and is
equivalent to 25 in the normal (decimal) counting system. We call this the
integer digital representation of 25, because it represents 25 as a whole
number, and is readily amenable to arithmetic operations. Whole numbers are
commonly stored in GIS using either short (2-byte or 16-bit) or long (4-byte
or 32-bit) options. Signed short integers can range from −32768 to +32767,
and signed long integers from −2147483648 to +2147483647.

The 8-bit ASCII (American Standard Code for Information Interchange) system
assigns codes to each symbol of text, including letters, numbers, and common
symbols. The number 2 is assigned ASCII code 50 (00110010 in binary), and the
number 5 is 53 (00110101), so if 25 were coded as two characters using 8-bit
ASCII its digital representation would be 16 bits long (0011001000110101).
The characters 2 = 2 would be coded as 50, 61, 50
(001100100011110100110010). ASCII is used for coding text, which consists of
mixtures of letters, numbers, and punctuation symbols.

Numbers with decimal places are coded using real or floating-point
representations. A number such as 123.456 (three decimal places and six
significant digits) is first transformed by powers of ten so that the decimal
point is in a standard position, such as the beginning (e.g., 0.123456 ×
10^3). The fractional part (0.123456) and the power of 10 (3) are then stored
in separate sections of a block of either 4 bytes (32 bits, single precision)
or 8 bytes (64 bits, double precision). (In practice computers scale by powers
of two rather than ten, but the principle is the same.) This gives enough
precision to store roughly 7 significant digits in single precision, or 15 in
double precision.

Integer, ASCII, and real conventions are adequate for most data, but in some
cases it is desirable to associate images or sounds with places in GIS, rather
than text or numbers. To allow for this GIS designers have included a BLOB
option (standing for binary large object), which simply allocates a sufficient
number of bits to store the image or sound, without specifying what those bits
might mean.
the same cheap digital devices – the components of PCs, the Internet, or mass
storage devices – to handle every type of information, independent of its
meaning. Digital data are easy to copy, they can be transmitted at close to
the speed of light, they can be stored at high density in very small spaces,
and they are less subject to the physical deterioration that affects paper and
other physical media. Perhaps more importantly, data in digital form are easy
to transform, process, and analyze. Geographic information systems allow us to
do things with digital representations that we were never able to do with
paper maps: to measure accurately and quickly, to overlay and combine, and to
change scale, zoom, and pan without respect to map sheet boundaries. The vast
array of possibilities for processing that digital representation opens up is
reviewed in Chapters 14 through 16, and is also covered in the applications
that are distributed throughout the book.

Digital representation has many uses because of
its simplicity and low cost.

3.3 Representation for what and for whom?

Thus far we have seen how humans are able to build representations of the
world around them, but we have not yet discussed why representations are
useful, and why humans have become so ingenious at creating and sharing them.
The emphasis here and throughout the book is on one type of representation,
termed geographic, and defined as a representation of some part of the Earth’s
surface or near-surface, at scales ranging from the architectural to the
global.

Geographic representation is concerned with the
Earth’s surface or near-surface.

Geographic representations are among the most ancient, having their roots in
the needs of very early societies. The tasks of hunting and gathering can be
much more efficient if hunters are able to communicate the details of their
successes to other members of their group – the locations of edible roots or
game, for example. Maps must have originated in the sketches early people made
in the dirt of campgrounds or on cave walls, long before language became
sufficiently sophisticated to convey equivalent information through speech. We
know that the peoples of the Pacific built representations of the locations of
islands, winds, and currents out of simple materials to guide each other, and
that very simple forms of representation are used by social insects such as
bees to communicate the locations of food resources.

Hand-drawn maps and speech are effective media for communication between
members of a small group, but much wider communication became possible with
the invention of the printing press in the 15th century. Now large numbers of
copies of a representation could be made and distributed, and for the first
time it became possible to imagine that something could be known by every
human being – that knowledge could be the common property of humanity. Only
one major restriction affected what could be distributed using this new
mechanism: the representation had to be flat. If one were willing to accept
that constraint, however, paper proved to be enormously effective; it was
cheap, light and thus easily transported, and durable. Only fire and water
proved to be disastrous for paper, and human history is replete with instances
of the loss of vital information through fire or flood, from the burning of
the Alexandria Library in the 7th century that destroyed much of the
accumulated knowledge of classical times to the major conflagrations of London
in 1666, San Francisco in 1906, or Tokyo in 1945, and the flooding of the Arno
that devastated Florence in 1966.

One of the most important periods for geographic representation began in the
early 15th century in Portugal. Henry the Navigator (Box 3.2) is often
credited with originating the Age of Discovery, the period of European history
that led to the accumulation of large amounts of information about other parts
of the world through sea voyages and land explorations. Maps became the medium
for sharing information about new discoveries, and for administering vast
colonial empires, and their value was quickly recognized. Although detailed
representations now exist of all parts of the world, including Antarctica, in
a sense the spirit of the Age of Discovery continues in the explorations of
the oceans, caves, and outer space, and in the process of re-mapping that is
needed to keep up with constant changes in the human and natural worlds.

It was the creation, dissemination, and sharing of accurate representations
that distinguished the Age of Discovery from all previous periods in human
history (and it would be unfair to ignore its distinctive negative
consequences, notably the spread of European diseases and the growth of the
slave trade). Information about other parts of the world was assembled in the
form of maps and journals, reproduced in large numbers using the recently
invented printing press, and distributed on paper. Even the modest costs
associated with buying copies were eventually addressed through the
development of free public lending libraries in the 19th century, which gave
access to virtually everyone. Today, we benefit from what is now a
longstanding tradition of free and open access to much of humanity’s
accumulated store of knowledge about the geographic world, in the form of
paper-based representations, through the institution of libraries and the
copyright doctrine that gives people rights to material for personal use (see
Chapter 18 for a discussion of laws affecting ownership and access). The
Internet has already become the delivery mechanism for providing distributed
access to geographic information.

In the Age of Discovery maps became extremely
valuable representations of the state of
geographic knowledge.

It is not by accident that the list of important applications for geographic
representations closely follows the list of applications of GIS (see
Section 1.1 and Chapter 2),

Biographical Box 3.2

Prince Henry the Navigator


Prince Henry of Portugal, who died in 1460, was known as Henry the
Navigator because of his keen interest in exploration. In 1433 Prince Henry
sent a ship from Portugal to explore the west coast of Africa in an attempt
to find a sea route to the Spice Islands. This ship was the first to travel
south of Cape Bojador (latitude 26 degrees 20 minutes N). To make this
and other voyages Prince Henry assembled a team of map-makers, sea
captains, geographers, ship builders, and many other skilled craftsmen.
Prince Henry showed the way for Vasco da Gama and other famous 15th
century explorers. His management skills could be applied in much the
same way in today’s GIS projects.

Figure 3.2 Prince Henry the Navigator, originator of the Age of Discovery in
the 15th century, and promoter of a systematic approach to the acquisition,
compilation, and dissemination of geographic knowledge

since representation is at the heart of our ability to solve problems using
digital tools. Any application of GIS requires clear attention to questions of
what should be represented, and how. There is a multitude of possible ways of
representing the geographic world in digital form, none of which is perfect,
and none of which is ideal for all applications.

The key GIS representation issues are what to
represent and how to represent it.

One of the most important criteria for the usefulness of a representation is
its accuracy. Because the geographic world is seemingly of infinite
complexity, there are always choices to be made in building any
representation – what to include, and what to leave out. When US President
Thomas Jefferson dispatched Meriwether Lewis to explore and report on the
nature of the lands from the upper Missouri to the Pacific, he said Lewis
possessed ‘a fidelity to the truth so scrupulous that whatever he should
report would be as certain as if seen by ourselves’. But he clearly didn’t
expect Lewis to report everything he saw in complete detail: Lewis exercised a
large amount of judgment about what to report, and what to omit. The question
of accuracy is taken up at length in Chapter 6.

One more vital interest drives our need for representations of the geographic
world, and also the need for representations in many other human activities.
When a pilot must train to fly a new type of aircraft, it is much cheaper and
less risky for him or her to work with a flight simulator than with the real
aircraft. Flight simulators can represent a much wider range of conditions
than a pilot will normally experience in flying. Similarly, when decisions
have to be made about the geographic world, it is effective to experiment
first on models or representations, exploring different scenarios. Of course
this works only if the representation behaves as the real aircraft or world
does, and a great deal of knowledge must be acquired about the world before an
accurate representation can be built that permits such simulations. But the
use of representations for training, exploring future scenarios, and
recreating the past is now common in many fields, including surgery,
chemistry, and engineering, and with technologies like GIS is becoming
increasingly common in dealing with the geographic world.

Many plans for the real world can be tried out first
on models or representations.

3.4 The fundamental problem

Geographic data are built up from atomic elements, or facts about the
geographic world. At its most primitive, an atom of geographic data (strictly,
a datum) links a place, often a time, and some descriptive property. The first
of these, place, is specified in one of several ways that are discussed at
length in Chapter 5, and there are also many ways of specifying the second,
time. We often use the term attribute to refer to the last of these three. For
example, consider the statement ‘The temperature at local noon on December 2nd
2004 at latitude 34 degrees
45 minutes north, longitude 120 degrees 0 minutes west, was 18 degrees
Celsius’. It ties location and time to the property or attribute of
atmospheric temperature.

Geographic data link place, time, and attributes.

Other facts can be broken down into their primitive atoms. For example, the
statement ‘Mount Everest is 8848 m high’ can be derived from two atomic
geographic facts, one giving the location of Mt Everest in latitude and
longitude, and the other giving the elevation at that latitude and longitude.
Note, however, that the statement would not be a geographic fact to a
community that had no way of knowing where Mt Everest is located.

Many aspects of the Earth’s surface are comparatively static and slow to
change. Height above sea level changes slowly because of erosion and movements
of the Earth’s crust, but these processes operate on scales of hundreds or
thousands of years, and for most applications except geophysics we can safely
omit time from the representation of elevation. On the other hand atmospheric
temperature changes daily, and dramatic changes sometimes occur in minutes
with the passage of a cold front or thunderstorm, so time is distinctly
important, though such climatic variables as mean annual temperature can be
represented as static.

The range of attributes in geographic information is vast. We have already
seen that some vary slowly and some rapidly. Some attributes are physical or
environmental in nature, while others are social or economic. Some attributes
simply identify a place or an entity, distinguishing it from all other places
or entities – examples include street addresses, social security numbers, or
the parcel numbers used for recording land ownership. Other attributes measure
something at a location and perhaps at a time (e.g., atmospheric temperature
or elevation), while others classify into categories (e.g., the class of land
use, differentiating between agriculture, industry, or residential land).
Because attributes are important outside the domain of GIS there are standard
terms for the different types (see Box 3.3).

Geographic attributes are classified as nominal,
ordinal, interval, ratio, and cyclic.

But this idea of recording atoms of geographic information, combining
location, time, and attribute, misses a fundamental problem, which is that the
world is in effect infinitely complex, and the number of atoms required for a
complete representation is similarly infinite. The closer we look at the
world, the more detail it reveals – and it seems that this process extends ad
infinitum. The shoreline of Maine appears complex on a map, but even more
complex when examined in greater detail, and as more detail is revealed the
shoreline appears to get longer and longer,

Technical Box 3.3

Types of attributes
The simplest type of attribute, termed nominal, is one that serves only to
identify or distinguish one entity from another. Placenames are a good
example, as are names of houses, or the numbers on a driver’s license – each
serves only to identify the particular instance of a class of entities and to
distinguish it from other members of the same class. Nominal attributes
include numbers, letters, and even colors. Even though a nominal attribute can
be numeric it makes no sense to apply arithmetic operations to it: adding two
nominal attributes, such as two drivers’ license numbers, creates nonsense.

Attributes are ordinal if their values have a natural order. For example,
Canada rates its agricultural land by classes of soil quality, with Class 1
being the best, Class 2 not so good, etc. Adding or taking ratios of such
numbers makes little sense, since 2 is not twice as much of anything as 1, but
at least ordinal attributes have inherent order. Averaging makes no sense
either, but the median, or the value such that half of the attributes are
higher-ranked and half are lower-ranked, is an effective substitute for the
average for ordinal data as it gives a useful central value.

Attributes are interval if the differences between values make sense. The
scale of Celsius temperature is interval, because it makes sense to say that
30 and 20 are as different as 20 and 10. Attributes are ratio if the ratios
between values make sense. Weight is ratio, because it makes sense to say that
a person of 100 kg is twice as heavy as a person of 50 kg; but Celsius
temperature is only interval, because 20 is not twice as hot as 10 (and this
argument applies to all scales that are based on similarly arbitrary zero
points, including longitude).

In GIS it is sometimes necessary to deal with data that fall into categories
beyond these four. For example, data can be directional or cyclic, including
flow direction on a map, or compass direction, or longitude, or month of the
year. The special problem here is that the number following 359 degrees is 0.
Averaging two directions such as 359 and 1 yields 180, so the average of two
directions close to North can appear to be South. Because cyclic data occur
sometimes in GIS, and few designers of GIS software have made special
arrangements for them, it is important to be alert to the problems that may
arise.
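The pitfall with cyclic data can be illustrated with a short Python sketch. It compares the ordinary arithmetic mean with a circular mean, a standard technique (not one described in this box) that treats each direction as a unit vector, sums the vectors, and takes the direction of the result. The function names are illustrative assumptions.

```python
import math


def naive_mean_deg(angles_deg):
    """Ordinary arithmetic mean, which ignores the wrap-around at 360."""
    return sum(angles_deg) / len(angles_deg)


def circular_mean_deg(angles_deg):
    """Mean direction in degrees, in the range [0, 360).

    Each angle is treated as a unit vector and the mean direction is the
    direction of the vector sum. The result is not meaningful when the
    vectors cancel out (for example 0 and 180), so treat that case with care.
    """
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    mean = math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0
    # Guard against floating-point rounding producing exactly 360.0.
    return 0.0 if mean >= 360.0 else mean


directions = [359.0, 1.0]
print(naive_mean_deg(directions))     # 180.0, which points due South
print(circular_mean_deg(directions))  # within rounding error of North
```

Because of floating-point rounding, the circular mean of 359 and 1 may print as a value marginally above 0 or marginally below 360; both describe North.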
and more and more convoluted (see Figure 4.18). To characterize the world completely we would have to specify the location of every person, every blade of grass, and every grain of sand – in fact, every subatomic particle, clearly an impossible task, since the Heisenberg uncertainty principle places limits on the ability to measure precise positions of subatomic particles. So in practice any representation must be partial – it must limit the level of detail provided, or ignore change through time, or ignore certain attributes, or simplify in some other way.

The world is infinitely complex, but computer systems are finite. Representations must somehow limit the amount of detail captured.

One very common way of limiting detail is by throwing away or ignoring information that applies only to small areas, in other words not looking too closely. The image you see on a computer screen is composed of a million or so basic elements or pixels, and if the whole Earth were displayed at once each pixel would cover an area roughly 10 km on a side, or about 100 sq km. At this level of detail the island of Manhattan occupies roughly 10 pixels, and virtually everything on it is a blur. We would say that such an image has a spatial resolution of about 10 km, and know that anything much less than 10 km across is virtually invisible. Figure 3.3 shows Manhattan at a spatial resolution of 250 m, detailed enough to pick out the shape of the island and Central Park.

It is easy to see how this helps with the problem of too much information. The Earth’s surface covers about 500 million sq km, so if this level of detail is sufficient for an application, a property of the surface such as elevation can be described with only 5 million pieces of information, instead of the 500 million it would take to describe elevation with a resolution of 1 km, and the 500 trillion (500 000 000 000 000) it would take to describe elevation with 1 m resolution.

Another strategy for limiting detail is to observe that many properties remain constant over large areas. For example, in describing the elevation of the Earth’s surface we could take advantage of the fact that roughly two-thirds of the surface is covered by water, with its surface at sea level. Of the 5 million pieces of information needed to describe elevation at 10 km resolution, approximately 3.4 million will be recorded as zero, a colossal waste. If we could find an efficient way of identifying the area covered by water, then we would need only 1.6 million real pieces of information.

Humans have found many ingenious ways of describing the Earth’s surface efficiently, because the problem we are addressing is as old as representation itself, and as important for paper-based representations as it is for binary representations in computers. But this ingenuity is itself the source of a substantial problem for GIS: there are many ways of representing the Earth’s surface, and users of GIS thus face difficult and at times confusing choices. This chapter discusses some of those choices, and the issues are pursued further in subsequent chapters on uncertainty (Chapter 6) and data modeling (Chapter 8). Representation remains a major concern of GIScience, and researchers are constantly looking for ways to extend GIS representations to accommodate new types of information (Box 3.5).

3.5 Discrete objects and continuous fields

3.5.1 Discrete objects

Mention has already been made of the level of detail as a fundamental choice in representation. Another, perhaps even more fundamental choice, is between two conceptual schemes. There is good evidence that we as humans like to simplify the world around us by naming things, and seeing individual things as instances of broader categories. We prefer a world of black and white, of good guys and bad guys, to the real world of shades of gray.

The two fundamental ways of representing geography are discrete objects and continuous fields.

This preference is reflected in one way of viewing the geographic world, known as the discrete object view. In this view, the world is empty, except where it is occupied by objects with well-defined boundaries that are instances of generally recognized categories. Just as the desktop is littered with books, pencils, or computers, the geographic world is littered with cars, houses, lampposts, and other discrete objects. Thus the landscape of Minnesota is littered with lakes, and the landscape of Scotland is littered with mountains. One characteristic of the discrete object view is that objects can be counted, so license plates issued by the State of Minnesota carry

Figure 3.3 An image of Manhattan taken by the MODIS instrument on board the TERRA satellite on September 12, 2001. MODIS has a spatial resolution of about 250 m, detailed enough to reveal the coarse shape of Manhattan and to identify the Hudson and East Rivers, the burning World Trade Center (white spot), and Central Park (the gray blur with the Jacqueline Kennedy Onassis Reservoir visible as a black dot)
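
The relationship between spatial resolution and data volume quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes the text's round figure of 500 million sq km for the Earth's surface and square cells. The function name is illustrative.

```python
# ~500 million sq km, the round figure quoted in the text, expressed in sq m
EARTH_SURFACE_M2 = 500_000_000 * 1_000_000


def cells_needed(resolution_m: int) -> int:
    """Number of square cells of the given side length needed to cover the Earth."""
    return EARTH_SURFACE_M2 // (resolution_m ** 2)


cells_10km = cells_needed(10_000)   # 5 million
cells_1km = cells_needed(1_000)     # 500 million
cells_1m = cells_needed(1)          # 500 trillion

# Halving the cell side quadruples the number of cells.
ratio_when_halved = cells_needed(5_000) / cells_needed(10_000)
```

Working in whole meters keeps the arithmetic in exact integers, so the three counts match the figures in the text without rounding.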

the legend ‘10 000 lakes’, and climbers know that there are exactly 284 mountains in Scotland over 3000 ft (the so-called Munros, from Sir Hugh Munro who originally listed 277 of them in 1891 – the count was expanded to 284 in 1997).

The discrete object view represents the geographic world as objects with well-defined boundaries in otherwise empty space.

Biological organisms fit this model well, and this allows us to count the number of residents in an area of a city, or to describe the behavior of individual bears. Manufactured objects also fit the model, and we have little difficulty counting the number of cars produced in a year, or the number of airplanes owned by an airline. But other phenomena are messier. It is not at all clear what constitutes a mountain, for example, or exactly how a mountain differs from a hill, or when a mountain with two peaks should be counted as two mountains.

Geographic objects are identified by their dimensionality. Objects that occupy area are termed two-dimensional, and generally referred to as areas. The term polygon is also common for technical reasons explained later. Other objects are more like one-dimensional lines, including roads, railways, or rivers, and are often represented as one-dimensional objects and generally referred to as lines. Other objects are more like zero-dimensional points, such as individual animals or buildings, and are referred to as points.

Of course, in reality, all objects that are perceptible to humans are three dimensional, and their representation in fewer dimensions can be at best an approximation. But the ability of GIS to handle truly three-dimensional objects as volumes with associated surfaces is very limited. Some GIS allow for a third (vertical) coordinate to be specified for all point locations. Buildings are sometimes represented by assigning height as an attribute, though if this option is used it is impossible to distinguish flat roofs from any other kind. Various strategies have been used for representing overpasses and underpasses in transportation networks, because this information is vital for navigation but not normally represented in strictly two-dimensional network representations. One common strategy is to represent turning options at every intersection – so an overpass appears in the database as an intersection with no turns (Figure 3.4).

Figure 3.4 The problems of representing a three-dimensional world using a two-dimensional technology. The intersection of links A, B, C, and D is an overpass, so no turns are possible between such pairs as A and B

The discrete object view leads to a powerful way of representing geographic information about objects. Think of a class of objects of the same dimensionality – for example, all of the Brown bears (Figure 3.5) in the Kenai Peninsula of Alaska. We would naturally think of these objects as points. We might want to know the sex of each bear, and its date of birth, if our interests were in monitoring the bear population. We might also have a collar on each bear that transmitted the bear’s location at regular intervals. All of this information could be expressed in a table, such as the one shown in Table 3.1, with each row corresponding to a different discrete object, and each column to an attribute of the object. To reinforce a point made earlier, this is a very efficient way of capturing raw geographic information on Brown bears.

But it is not perfect as a representation for all geographic phenomena. Imagine visiting the Earth from another planet, and asking the humans what they chose as a representation for the infinitely complex and beautiful environment around them. The visitor would hardly be impressed to learn that they chose tables, especially when the phenomena represented were natural phenomena such as rivers, landscapes, or oceans. Nothing on the natural Earth looks remotely like a table. It is not at all clear how the properties of a river should be represented as a table, or the properties of an ocean. So while the discrete object

Figure 3.5 Bears are easily conceived as discrete objects, maintaining their identity as objects through time and surrounded by empty space
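
The tabular representation of discrete objects (one row per object, one column per attribute) maps naturally onto a list of records. The sketch below uses the imaginary bear data of Table 3.1. The class and field names are illustrative, and reading the collar dates as month-day-year is an assumption about the table's format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Bear:
    """One discrete point object: a row of the table."""
    bear_id: str
    sex: str
    birth_year: int          # estimated year of birth
    collar_installed: date   # assumed MMDDYYYY in the source table
    lon: float               # location at noon on 31 July 2003
    lat: float


bears = [
    Bear("001", "M", 1999, date(2003, 2, 24), -150.6432, 60.0567),
    Bear("002", "F", 1997, date(2003, 3, 31), -149.9979, 59.9665),
    Bear("003", "F", 1994, date(2003, 4, 21), -150.4639, 60.1245),
    Bear("004", "F", 1995, date(2003, 4, 21), -150.4692, 60.1152),
]

# Because the objects are discrete, they can be counted and queried by attribute.
female_count = sum(1 for b in bears if b.sex == "F")
oldest = min(bears, key=lambda b: b.birth_year)
```

Counting and attribute queries are trivial in this form, which is the efficiency the discrete object view offers for phenomena that fit it.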
view works well for some kinds of phenomena, it misses the mark badly for others.

Table 3.1 Example of representation of geographic information as a table: the locations and attributes of each of four Brown bears in the Kenai Peninsula of Alaska. Locations have been obtained from radio collars. Only one location is shown for each bear, at noon on July 31 2003 (imaginary data)

Bear ID   Sex   Estimated year   Date of collar   Location, noon on
                of birth         installation     31 July 2003
001       M     1999             02242003         −150.6432, 60.0567
002       F     1997             03312003         −149.9979, 59.9665
003       F     1994             04212003         −150.4639, 60.1245
004       F     1995             04212003         −150.4692, 60.1152

3.5.2 Continuous fields

While we might think of terrain as composed of discrete mountain peaks, valleys, ridges, slopes, etc., and think of listing them in tables and counting them, there are unresolvable problems of definition for all of these objects. Instead, it is much more useful to think of terrain as a continuous surface, in which elevation can be defined rigorously at every point (see Box 3.4). Such continuous surfaces form the basis of the other common view of geographic phenomena, known as the continuous field view (and not to be confused with other meanings of the word field). In this view the geographic world can be described by a number of variables, each measurable at any point on the Earth’s surface, and changing in value across the surface.

The continuous field view represents the real world as a finite number of variables, each one defined at every possible position.

Objects are distinguished by their dimensions, and naturally fall into categories of points, lines, or areas. Continuous fields, on the other hand, can be distinguished by what varies, and how smoothly. A continuous field of elevation, for example, varies much more smoothly in a landscape that has been worn down by glaciation or flattened by blowing sand than one recently created by cooling lava. Cliffs are places in continuous fields where elevation changes suddenly, rather than smoothly. Population density is a kind of continuous field, defined everywhere as the number of people per unit area, though the definition breaks down if the field is examined so closely that the individual people become visible. Continuous fields can also be created from classifications of land, into categories of land use, or soil type. Such fields change suddenly at the boundaries between different classes. Other types of fields can be defined by continuous variation along lines, rather than across space. Traffic density, for example, can be defined everywhere on a road network, and flow volume can be defined everywhere on a river. Figure 3.6 shows some examples of field-like phenomena.

Continuous fields can be distinguished by what is being measured at each point. Like the attribute types discussed in Box 3.3, the variable may be nominal, ordinal, interval, ratio, or cyclic. A vector field assigns two variables, magnitude and direction, at every point in space, and is used to represent flow phenomena such as winds or currents; fields of only one variable are termed scalar fields.

Here is a simple example illustrating the difference between the discrete object and field conceptualizations. Suppose you were hired for the summer to count the number of lakes in Minnesota, and promised that your answer would appear on every license plate issued by the state. The task sounds simple, and you were happy to get the job. But on the first day you started to run into difficulty (Figure 3.7). What about small ponds, do they count as lakes? What about wide stretches of rivers? What about swamps that dry up in the summer? What about a lake with a narrow section connecting two wider parts, is it one lake or two? Your biggest dilemma concerns the scale of mapping, since the number of lakes shown on a map clearly depends on the map’s level of detail – a more detailed map almost certainly will show more lakes.

Your task clearly reflects a discrete object view of the phenomenon. The action of counting implies that lakes are discrete, two-dimensional objects littering an otherwise empty geographic landscape. In a continuous field view, on the other hand, all points are either lake or non-lake. Moreover, we could refine the scale a little to take account

Technical Box 3.4

2.5 dimensions

Areas are two-dimensional objects, and volumes are three dimensional, but GIS users sometimes talk about ‘2.5-D’. Almost without exception the elevation of the Earth’s surface has a single value at any location (exceptions include overhanging cliffs). So elevation is conveniently thought of as a continuous field, a variable with a value everywhere in two dimensions, and a full 3-D representation is only necessary in areas with an abundance of overhanging cliffs or caves, if these are important features. The idea of dealing with a three-dimensional phenomenon by treating it as a single-valued function of two horizontal variables gives rise to the term ‘2.5-D’. Figure 3.6B shows an example, in this case an elevation surface.
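
The idea of a field as a single-valued function of two horizontal variables can be illustrated directly. The sketch below is hypothetical throughout: both functions are synthetic formulas chosen for illustration, not real terrain or wind data. The scalar field returns one value per location (which is why it cannot express an overhang), while the vector field returns a magnitude and a direction.

```python
import math


def elevation(x: float, y: float) -> float:
    """Scalar field: exactly one elevation value for every (x, y) location."""
    return 100.0 * math.exp(-(x ** 2 + y ** 2))  # synthetic hill, illustrative only


def wind(x: float, y: float):
    """Vector field: a (magnitude, direction) pair at every location.

    Direction is in degrees counterclockwise from the x axis (a convention
    assumed for this sketch).
    """
    u, v = -y, x  # synthetic counterclockwise circulation around the origin
    magnitude = math.hypot(u, v)
    direction = math.degrees(math.atan2(v, u)) % 360.0
    return magnitude, direction


summit = elevation(0.0, 0.0)
speed, heading = wind(1.0, 0.0)
```

Both functions are defined at every point, so they still imply an infinite amount of information. A digital representation has to sample or summarize them in some finite way.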

Figure 3.6 Examples of field-like phenomena. (A) Image of part of the Dead Sea in the Middle East. The lightness of the image at any point measures the amount of radiation captured by the satellite’s imaging system. (B) A simulated image derived from the Shuttle Radar Topography Mission, a new source of high-quality elevation data. The image shows the Carrizo Plain area of Southern California, USA, with a simulated sky and with land cover obtained from other satellite sources (Courtesy NASA/JPL–Caltech)

of marginal cases; for example, we might define the scale shown in Table 3.2, which has five degrees of lakeness. The complexity of the view would depend on how closely we looked, of course, and so the scale of mapping would still be important. But all of the problems of defining a lake as a discrete object would disappear (though there would still be problems in defining the levels of the scale). Instead of counting, our strategy would be to lay a grid over the map, and assign each grid cell a score on the lakeness scale. The size of the grid cell would determine how accurately the result approximated the value we could theoretically obtain by visiting every one of the infinite
number of points in the state. At the end, we would tabulate the resulting scores, counting the number of cells having each value of lakeness, or averaging the lakeness score. We could even design a new and scientifically more reasonable license plate – ‘Minnesota, 12% lake’ or ‘Minnesota, average lakeness 2.02’.

Figure 3.7 Lakes are difficult to conceptualize as discrete objects because it is often difficult to tell where a lake begins and ends, or to distinguish a wide river from a lake

The difference between objects and fields is also illustrated well by photographs (e.g., Figure 3.6A). The image in a photograph is created by variation in the chemical state of the material in the photographic film – in early photography, minute particles of silver were released from molecules of silver nitrate when the unstable molecules were exposed to light, thus darkening the image in proportion to the amount of incident light. We think of the image as a field of continuous variation in color or darkness. But when we look at the image, the eye and brain begin to infer the presence of discrete objects, such as people, rivers, fields, cars, or houses, as they interpret the content of the image.

3.6 Rasters and vectors

Continuous fields and discrete objects define two conceptual views of geographic phenomena, but they do not solve the problem of digital representation. A continuous field view still potentially contains an infinite amount of information if it defines the value of the variable at every point, since there is an infinite number of points in any defined geographic area. Discrete objects can also require an infinite amount of information for full description – for example, a coastline contains an infinite amount of information if it is mapped in infinite detail. Thus continuous fields and discrete objects are no more than conceptualizations, or ways in which we think about geographic phenomena; they are not designed to deal with the limitations of computers.

Two methods are used to reduce geographic phenomena to forms that can be coded in computer databases, and we call these raster and vector. In principle, both can be used to code both fields and discrete objects, but in practice there is a strong association between raster and fields, and between vector and discrete objects.

Raster and vector are two methods of representing geographic data in digital computers.

3.6.1 Raster data

In a raster representation space is divided into an array of rectangular (usually square) cells (Figure 3.8). All geographic variation is then expressed by assigning properties or attributes to these cells. The cells are sometimes called pixels (short for picture elements).

Raster representations divide the world into arrays of cells and assign attributes to the cells.

One of the commonest forms of raster data comes from remote-sensing satellites, which capture information in this form and send it to ground to be distributed and analyzed. Data from the Landsat Thematic Mapper, for example, which are commonly used in GIS applications, come in cells that are 30 m a side on the ground, or approximately 0.1 hectare in area. Other similar data can be obtained from sensors mounted on aircraft. Imagery varies according to the spatial resolution (expressed as the length of a cell side as measured on the ground), and also according to the timetable of image capture by the

Table 3.2 A scale of lakeness suitable for defining lakes as a continuous field

Lakeness   Definition
1          Location is always dry under all circumstances
2          Location is sometimes flooded in Spring
3          Location supports marshy vegetation
4          Water is always present to a depth of less than 1 m
5          Water is always present to a depth of more than 1 m
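
The grid-scoring strategy proposed for the lakeness scale of Table 3.2 amounts to tabulating cell values. The sketch below uses a tiny invented grid (the scores are hypothetical, not Minnesota data), counts the cells at each lakeness level, and computes the two summary figures the text suggests for a license plate. Treating levels 4 and 5 as "lake" is an assumption made for the example.

```python
from collections import Counter

# Hypothetical 3 x 4 grid of lakeness scores (1 = always dry ... 5 = deep water).
grid = [
    [1, 1, 2, 3],
    [1, 2, 4, 5],
    [1, 3, 5, 5],
]

scores = [score for row in grid for score in row]

cells_per_level = Counter(scores)                # tabulate cells by lakeness value
average_lakeness = sum(scores) / len(scores)     # the averaging the text proposes
lake_share = sum(1 for s in scores if s >= 4) / len(scores)  # assumed: 4 and 5 count as lake
```

A finer grid would change all three figures, which is the dependence on cell size that the text points out.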

sensor. Some satellites are in geostationary orbit over a fixed point on the Earth, and capture images constantly. Others pass over a fixed point at regular intervals (e.g., every 12 days). Finally, sensors vary according to the part or parts of the spectrum that they sense. The visible parts of the spectrum are most important for remote sensing, but some invisible parts of the spectrum are particularly useful in detecting heat, and the phenomena that produce heat, such as volcanic activities. Many sensors capture images in several areas of the spectrum, or bands, simultaneously, because the relative amounts of radiation in different parts of the spectrum are often useful indicators of certain phenomena, such as green leaves, or water, on the Earth’s surface. The AVIRIS (Airborne Visible InfraRed Imaging Spectrometer) captures no fewer than 224 different parts of the spectrum, and is being used to detect particular minerals in the soil, among other applications. Remote sensing is a complex topic, and further details are available in Chapter 9.

Figure 3.8 Raster representation. Each color represents a different value of a nominal-scale variable denoting land cover class

Square cells fit together nicely on a flat table or a sheet of paper, but they will not fit together neatly on the curved surface of the Earth. So just as representations on paper require that the Earth be flattened, or projected, so too do rasters (because of the distortions associated with flattening, the cells in a raster can never be perfectly equal in shape or area on the Earth’s surface). Projections, or ways of flattening the Earth, are described in Section 5.7. Many of the terms that describe rasters suggest the laying of a tile floor on a flat surface – we talk of raster cells tiling an area, and a raster is said to be an instance of a tessellation, derived from the word for a mosaic. The mirrored ball hanging above a dance floor recalls the impossibility of covering a spherical object like the Earth perfectly with flat, square pieces.

When information is represented in raster form all detail about variation within cells is lost, and instead the cell is given a single value. Suppose we wanted to represent the map of the counties of Texas as a raster. Each cell would be given a single value to identify a county, and we would have to decide the rule to apply when a cell falls in more than one county. Often the rule is that the county with the largest share of the cell’s area gets the cell. Sometimes the rule is based on the central point of the cell, and the county at that point is assigned to the whole cell. Figure 3.9 shows these two rules in operation. The largest share rule is almost always preferred, but the central point rule is sometimes used in the interests of faster computing, and is often used in creating raster datasets of elevation.

3.6.2 Vector data

In a vector representation, all lines are captured as points connected by precisely straight lines (some GIS software allows points to be connected by curves rather than straight lines, but in most cases curves have to be approximated by increasing the density of points). An area is captured as a series of points or vertices connected by straight lines as shown in Figure 3.10. The straight edges between vertices explain why areas in vector representation are often called polygons, and in GIS-speak the terms polygon and area are often used

Figure 3.10 An area (red line) and its approximation by a polygon (blue line)

Figure 3.9 Effect of a raster representation using (A) the largest share rule and (B) the central point rule
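
The two cell-assignment rules, largest share and central point, can disagree for the same cell. The sketch below is a hypothetical illustration: `county_at` stands in for a real county map, and the largest share is only approximated by sampling a regular lattice of points inside the cell rather than computing exact areas.

```python
from collections import Counter


def county_at(x: float, y: float) -> str:
    """Hypothetical two-county map: county A occupies the lower-left corner."""
    return "A" if (x < 0.55 and y < 0.55) else "B"


def central_point_rule(x0: float, y0: float, size: float) -> str:
    """Assign the cell to whatever county lies at its center."""
    return county_at(x0 + size / 2, y0 + size / 2)


def largest_share_rule(x0: float, y0: float, size: float, n: int = 20) -> str:
    """Assign the cell to the county covering most of it (estimated from n x n samples)."""
    step = size / n
    samples = Counter(
        county_at(x0 + (i + 0.5) * step, y0 + (j + 0.5) * step)
        for i in range(n)
        for j in range(n)
    )
    return samples.most_common(1)[0][0]


cell = (0.0, 0.0, 1.0)                  # lower-left corner and side length
by_center = central_point_rule(*cell)   # the center falls in county A
by_share = largest_share_rule(*cell)    # but county B covers most of the cell
```

The central point rule needs a single lookup per cell, while the sampled largest share rule needs n × n, which is consistent with the text's remark that the central point rule is sometimes chosen for faster computing.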
interchangeably. Lines are captured in the same way, and the term polyline has been coined to describe a curved line represented by a series of straight segments connecting vertices.

To capture an area object in vector form, we need only specify the locations of the points that form the vertices of a polygon. This seems simple, and also much more efficient than a raster representation, which would require us to list all of the cells that form the area. These ideas are captured succinctly in the comment ‘Raster is vaster, and vector is correcter’. To create a precise approximation to an area in raster, it would be necessary to resort to using very small cells, and the number of cells would rise proportionately (in fact, every halving of the width and height of each cell would result in a quadrupling of the number of cells). But things are not quite as simple as they seem. The apparent precision of vector is often unreasonable, since many geographic phenomena simply cannot be located with high accuracy. So although raster data may look less attractive, they may be more honest to the inherent quality of the data. Also, various methods exist for compressing raster data that can greatly reduce the capacity needed to store a given dataset (see Chapter 8). So the choice between raster and vector is often complex, as summarized in Table 3.3.

Table 3.3 Relative advantages of raster and vector representation

Issue             Raster                          Vector
Volume of data    Depends on cell size            Depends on density of vertices
Sources of data   Remote sensing, imagery         Social and environmental data
Applications      Resources, environmental        Social, economic, administrative
Software          Raster GIS, image processing    Vector GIS, automated cartography
Resolution        Fixed                           Variable

3.6.3 Representing continuous fields

While discrete objects lend themselves naturally to representation as points, lines, or areas using vector methods, it is less obvious how the continuous variation of a field can be expressed in a digital representation. In GIS six alternatives are commonly implemented (Figure 3.11):

A. capturing the value of the variable at each of a grid of regularly spaced sample points (for example, elevations at 30 m spacing in a DEM);
B. capturing the value of the field variable at each of a set of irregularly spaced sample points (for example, variation in surface temperature captured at weather stations);
C. capturing a single value of the variable for a regularly shaped cell (for example, values of reflected radiation in a remotely sensed scene);
D. capturing a single value of the variable over an irregularly shaped area (for example, vegetation cover class or the name of a parcel’s owner);
E. capturing the linear variation of the field variable over an irregularly shaped triangle (for example, elevation captured in a triangulated irregular network or TIN, Section 9.2.3.4);
F. capturing the isolines of a surface, as digitized lines (for example, digitized contour lines representing surface elevation).

Each of these methods succeeds in compressing the potentially infinite amount of data in a continuous field to a finite amount, using one of the six options, two of which (A and C) are raster, and four (B, D, E, and F) are vector. Of the vector methods one (B) uses points, two (D and E) use polygons, and one (F) uses lines to express the continuous spatial variation of the field in terms of a finite set of vector objects. But unlike the discrete object conceptualization, the objects used to represent a field are not real, but simply artifacts of the representation of something that is actually conceived as spatially continuous. The triangles of a TIN representation (E), for example, exist only in the digital representation, and cannot be found on the ground, and neither can the lines of a contour representation (F).

3.7 The paper map

The paper map has long been a powerful and effective means of communicating geographic information. In contrast to digital data, which use coding schemes such as ASCII, it is an instance of an analog representation, or a physical model in which the real world is scaled – in the case of the paper map, part of the world is scaled to fit the size of the paper. A key property of a paper map is its scale or representative fraction, defined as the ratio of distance on the map to distance on the Earth’s surface. For example, a map with a scale of 1:24 000 reduces everything on the Earth to one 24 000th of its real size. This is a bit misleading, because the Earth’s surface is curved and a paper map is flat, so scale cannot be exactly constant.

A paper map is: a source of data for geographic databases; an analog product from a GIS; and an effective communication tool.

Maps have been so important, particularly prior to the development of digital technology, that many of the ideas associated with GIS are actually inherited directly from paper maps. For example, scale is often cited as a property of a digital database, even though the definition of scale makes no sense for digital data – ratio of distance in the

Figure 3.11 The six approximate representations of a field used in GIS. (A) Regularly spaced sample points. (B) Irregularly spaced
sample points. (C) Rectangular cells. (D) Irregularly shaped polygons. (E) Irregular network of triangles, with linear variation over
each triangle (the Triangulated Irregular Network or TIN model; the bounding box is shown dashed in this case because the unshown
portions of complete triangles extend outside it). (F) Polylines representing contours (see the discussion of isopleth maps in Box 4.3)
(Courtesy US Geological Survey)
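
Option (E), linear variation over each triangle of a TIN, can be illustrated with a small interpolation routine. This is a minimal sketch under stated assumptions: the barycentric-weight formulation is one standard way to evaluate a plane through three vertices, the function name and sample triangle are invented, and nothing here reflects how any particular GIS stores a TIN.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linearly interpolate z at point p = (x, y) inside triangle a, b, c.

    Each vertex is an (x, y, z) tuple. Returns None if p lies outside the triangle.
    """
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    # Barycentric weights: each vertex's share of influence at p.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical triangle with elevations at its three vertices.
triangle = ((0.0, 0.0, 100.0), (10.0, 0.0, 200.0), (0.0, 10.0, 300.0))

z_inside = interpolate_in_triangle((2.0, 2.0), *triangle)
z_outside = interpolate_in_triangle((10.0, 10.0), *triangle)
```

Only the three vertices are stored, yet an elevation can be recovered at any point inside the triangle. This is how a finite set of vector objects stands in for a continuous field.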

Biographical Box 3.5

May Yuan and new forms of representation


May Yuan received her Bachelor of Science degree in Geography from
the National Taiwan University, where she was attracted to the fields of
geomorphology and climatology. Continuing her fundamental interest
in evolution of processes, she studied geographic representation and
temporal GIS and earned both her Masters and PhD degrees in Geography
from the State University of New York at Buffalo.

Figure 3.13 May Yuan, developer of new forms of representation

Currently, May is an Associate Professor of Geography at the University
of Oklahoma. Severe weather in the Southern Plains of the United
States (Figure 3.12) has inspired her to re-evaluate GIS representation
of geographic dynamics, the complexity of events and processes at spatial
and temporal scales, and GIS applications in meteorology (i.e., weather
and climate). She investigates meteorological cases (e.g., convective storms
and flash floods) to develop new ideas of using events and processes as
the basis to integrate spatial and temporal data in GIS. Her publications
address theoretical issues on representation of geographic dynamics and
offer conceptual models and a prototype GIS to support spatiotemporal queries and analysis of dynamic
geographic phenomena. Her temporal GIS research goes beyond merely considering time as an attribute
or annotation of spatial objects to incorporate much richer spatiotemporal meaning. In her case study on
convective storms, she has demonstrated that, by modeling storms as data objects, GIS is able to support
information query about storm evolution, storm behaviors, and interactions with environments.
May developed a strong interest in physics in early childhood. Newton’s theory of universal gravitation
sparked her appreciation for simple principles that can explain how things work and for the use of
graphical and symbolic representation to conceptualize complex processes. Planck’s quantum theory and
Heisenberg’s uncertainty principle further stimulated her thinking on the nature of matter and its behavior
at different scales of observations. Shaped by Einstein’s theory of relativity, May developed her world view
as a four-dimensional space-time continuum populated with events and phenomena. Before she pursued a
career in GIScience, May studied fluvial processes and developed a model to classify waterfalls and explain

Figure 3.12 Representative radar images showing the evolution of supercell storms that produced F5 tornadoes in Oklahoma City,
May 3, 1999. WSR-88D radar TKLX scanned the supercells every five minutes, but the images shown here were selected
approximately every two hours

their formation. She went on to study paleoclimatology by analyzing soil and speleothem sediments. Both
studies, as well as her dissertation research on wildfire representation, reinforced her interest in developing
conceptual models of processes and examining the relationships between space and time. Since she moved
to the University of Oklahoma, a suite of world-class meteorological research initiatives has offered her
unique opportunities to extend her interest in physics to fundamental research in GIScience through
meteorological applications. Weather and climate offer rich cases that emphasize movement, processes, and
evolution and pose grand challenges to GIScience research regarding representation, object-field duality,
and uncertainty. May enjoys the challenges that ultimately connect to her fundamental interest in how
things work.

computer to distance on the ground; how can there be in Chapter 6, where it is important to the concept of
distances in a computer? What is meant is a little more uncertainty.
complicated: when a scale is quoted for a digital database There is a close relationship between the contents
it is usually the scale of the map that formed the source of a map and the raster and vector representations
of the data. So if a database is said to be at a scale of discussed in the previous section. The US Geological
1:24 000 one can safely assume that it was created from Survey, for example, distributes two digital versions of
a paper map at that scale, and includes representations its topographic maps, one in raster form and one in
of the features that are found on maps at that scale. vector form, and both attempt to capture the contents
Further discussion of scale can be found in Box 4.2 and of the map as closely as possible. In the raster form, or
CHAPTER 3 REPRESENTING GEOGRAPHY 79

Figure 3.14 Part of a Digital Raster Graphic, a scan of a US Geological Survey 1:24 000 topographic map

digital raster graphic (DRG), the map is scanned at a very high density, using very small pixels,
so that the raster looks very much like the original (Figure 3.14). The coding of each pixel simply
records the color of the map picked up by the scanner, and the dataset includes all of the textual
information surrounding the actual map.

In the vector form, or digital line graph (DLG), every geographic feature shown on the map is
represented as a point, polyline, or polygon. The symbols used to represent point features on the
map, such as the symbol for a windmill, are replaced in the digital data by points with associated
attributes, and must be regenerated when the data are displayed. Contours, which are shown on the
map as lines of definite width, are replaced by polylines of no width, and given attributes that
record their elevations.

In both cases, and especially in the vector case, there is a significant difference between the
analog representation of the map and its digital equivalent. So it is quite misleading to think of
the contents of a digital representation as a map, and to think of a GIS as a container of digital
maps. Digital representations can include information that would be very difficult to show on maps.
For example, they can represent the curved surface of the Earth, without the need for the
distortions associated with flattening. They can represent changes, whereas maps must be static
because it is very difficult to change their contents once they have been printed or drawn. Digital
databases can represent all three spatial dimensions, including the vertical, whereas maps must
always show two-dimensional views. So while the paper map is a useful metaphor for the contents of
a geographic database, we must be careful not to let it limit our thinking about what is possible
in the way of representation. This issue is pursued at greater length in Chapter 8, and map
production is discussed in detail in Chapter 12.
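As a rough illustration of the contrast between the two forms, the sketch below models a raster graphic as a grid of scanned pixel colors and a line graph as a collection of point and polyline features with attributes. The class names, attribute keys, coordinates, and elevation value are hypothetical and do not reproduce the US Geological Survey file formats; the aim is only to show that a contour in the vector form is a polyline carrying its elevation as an attribute, while the raster form stores nothing but color.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]


@dataclass
class RasterGraphic:
    """Hypothetical DRG-like structure: each cell holds only a scanned color."""
    pixels: List[List[str]]  # rows of color names picked up by the scanner

    def color_at(self, row: int, col: int) -> str:
        return self.pixels[row][col]


@dataclass
class VectorFeature:
    """Hypothetical DLG-like feature: geometry plus descriptive attributes."""
    kind: str                      # "point", "polyline", or "polygon"
    coordinates: List[Coordinate]  # vertices; a point has exactly one
    attributes: Dict[str, object] = field(default_factory=dict)


# A tiny scanned patch: a brown contour line crossing a white background.
drg = RasterGraphic(pixels=[
    ["white", "brown", "white"],
    ["brown", "white", "white"],
])

# The same contour as a zero-width polyline with its elevation recorded.
contour = VectorFeature(
    kind="polyline",
    coordinates=[(0.0, 1.0), (1.0, 0.0)],
    attributes={"feature": "contour", "elevation_ft": 5200},
)

# A windmill symbol becomes a point; the symbol is regenerated at display time.
windmill = VectorFeature(
    kind="point",
    coordinates=[(2.5, 0.5)],
    attributes={"feature": "windmill"},
)

print(drg.color_at(0, 1))                  # the raster knows only the color
print(contour.attributes["elevation_ft"])  # the vector feature knows its elevation
```

In this sketch a question such as "what is the elevation here?" can be answered directly from the vector feature, whereas the raster would first require the colored pixels to be interpreted.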

3.8 Generalization

In Section 3.4 we saw how thinking about geographic information as a collection of atomic links –
between a place, a time (not always, because many geographic facts are stated as if they were
permanently true), and a property – led to an immediate problem, because the potential number of
such atomic facts is infinite. If seen in enough detail, the Earth’s surface is unimaginably
complex, and its effective description impossible. So instead, humans have devised numerous ways of
simplifying their view of the world. Instead of making statements about each and every point, we
describe entire areas, attributing uniform characteristics to them, even when areas are not
strictly uniform; we identify features on the ground and describe their characteristics, again
assuming them to be uniform; or we limit our descriptions to what exists at a finite number of
sample points, hoping that these samples will be adequately representative of the whole
(Section 4.4).

A geographic database cannot contain a perfect description – instead, its contents must be
carefully selected to fit within the limited capacity of computer storage devices.

From this perspective some degree of generalization is almost inevitable in all geographic data.
But cartographers often take a somewhat different approach, for which this observation is not
necessarily true. Suppose we are tasked to prepare a map at a specific scale, say 1:25 000, using
the standards laid down by a national mapping agency, such as the Institut Géographique National
(IGN) of France. Every scale used by IGN has its associated rules of representation. For example,
at a scale of 1:25 000 the rules lay down that individual buildings will be shown only in specific
circumstances, and similar rules apply to the 1:24 000 series of the US Geological Survey. These
rules are known by various names, including terrain nominal in the case of IGN, which translates
roughly but not very helpfully to ‘nominal ground’, and is perhaps better translated as
‘specification’. From this perspective a map that represents the world by following the rules of a
specification precisely can be perfectly accurate with respect to the specification, even though it
is not a perfect representation of the full detail on the ground.

A map’s specification defines how real features on the ground are selected for inclusion on the map.

Consider the representation of vegetation cover using the rules of a specification. For example,
the rules might state that at a scale of 1:100 000, a vegetation cover map should not show areas of
vegetation that cover less than 1 hectare. But small areas of vegetation almost certainly exist, so
deleting them inevitably results in information loss. But under the principle discussed above, a
map that adheres to this rule must be accurate, even though it differs substantively from the truth
as observed on the ground.

3.8.1 Methods of generalization

A GIS dataset’s level of detail is one of its most important properties, as it determines both the
degree to which the dataset approximates the real world, and the dataset’s complexity. It is often
necessary to remove detail, in the interests of compressing data, fitting them into a storage
device of limited capacity, processing them faster, or creating less confusing visualizations that
emphasize general trends. Consequently many methods have been devised for generalization, and
several of the more important are discussed in this section.

McMaster and Shea (1992) identify the following types of generalization rules:

■ simplification, for example by weeding out points in the outline of a polygon to create a simpler shape;
■ smoothing, or the replacement of sharp and complex forms by smoother ones;
■ aggregation, or the replacement of a large number of distinct symbolized objects by a smaller number of new symbols;
■ amalgamation, or the replacement of several area objects by a single area object;
■ merging, or the replacement of several line objects by a smaller number of line objects;
■ collapse, or the replacement of an area object by a combination of point and line objects;
■ refinement, or the replacement of a complex pattern of objects by a selection that preserves the pattern’s general form;
■ exaggeration, or the relative enlargement of an object to preserve its characteristics when these would be lost if the object were shown to scale;
■ enhancement, through the alteration of the physical sizes and shapes of symbols; and
■ displacement, or the moving of objects from their true positions to preserve their visibility and distinctiveness.

The differences between these types of rules are much easier to understand visually and Figure 3.15
reproduces McMaster’s and Shea’s original example drawings. In addition, they describe two forms of
generalization of attributes, as distinct from geometric forms of generalization. Classification
generalization reclassifies the attributes of objects into a smaller number of classes, while
symbolization generalization changes the assignment of symbols to objects. For example, it might
replace an elaborate symbol including the words ‘Mixed Forest’ with a color identifying that class.

3.8.2 Weeding

One of the commonest forms of generalization in GIS is the process known as weeding, or the
simplification of the representation of a line represented as a polyline. The process is an
instance of McMaster and Shea’s simplification. Standard methods exist in GIS for doing

Figure 3.15 Illustrations from McMaster and Shea (1992) of their ten forms of generalization. The original feature is shown at its
original level of detail, and below it at 50% coarser scale. Each generalization technique resolves a specific problem of display at
coarser scale and results in the acceptable version shown in the lower right
[Panels: Simplification, Smoothing, Collapse, Aggregation, Amalgamation, Merge, Refinement, Exaggeration, Enhancement, Displacement;
each compares the representation in the original map with the representation in the generalized map, at original map scale and at 50% scale]
Figure 3.16 The Douglas–Poiker algorithm is designed to simplify complex objects like this shoreline by reducing the
number of points in its polyline representation

this, and the commonest by far is the method known as the Douglas–Poiker algorithm (Figure 3.16)
after its inventors, David Douglas and Tom Poiker. The operation of the Douglas–Poiker weeding
algorithm is shown in Figure 3.17.
Weeding is the process of simplifying a line or area
by reducing the number of points in its
representation.
Note that the algorithm relies entirely on the assumption that the line is represented as a
polyline, in other words as a series of straight line segments. GIS increasingly support other
representations, including arcs of circles, arcs of ellipses, and Bézier curves, but there is
little consensus to date on appropriate methods for weeding or generalizing them, or on methods of
analysis that can be applied to them.

Figure 3.17 The Douglas–Poiker line simplification algorithm in action. The original polyline has 15 points. In (A) Points 1
and 15 are connected (red), and the furthest distance of any point from this connection is identified (blue). This distance to
Point 4 exceeds the user-defined tolerance. In (B) Points 1 and 4 are connected (green). Points 2 and 3 are within the tolerance
of this line. Points 4 and 15 are connected, and the process is repeated. In the final step 7 points remain (identified with green
disks), including 1 and 15. No points are beyond the user-defined tolerance distance from the line
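The weeding procedure described in the caption to Figure 3.17 can be sketched in a few lines of Python. This is a minimal recursive sketch rather than the implementation found in any particular GIS: the function names `weed` and `perpendicular_distance` are hypothetical, the sample coordinates are invented, and distance is measured to the infinite straight line through the two end points, which is an assumption about how "distance from this connection" is defined.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def perpendicular_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the straight line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        # a and b coincide, so fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def weed(points: List[Point], tolerance: float) -> List[Point]:
    """Recursively drop points lying within `tolerance` of the connecting line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the intermediate point furthest from the first-to-last connection.
    index, furthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > furthest:
            index, furthest = i, d
    if furthest <= tolerance:
        # Every intermediate point is within tolerance: keep only the ends.
        return [first, last]
    # Otherwise keep the furthest point and repeat on both halves.
    left = weed(points[: index + 1], tolerance)
    right = weed(points[index:], tolerance)
    return left[:-1] + right  # avoid duplicating the shared point


# Invented sample polyline: a nearly flat start followed by a sharp peak.
line = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
print(weed(line, tolerance=0.5))
# [(0.0, 0.0), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
```

With a tolerance of 0.5 only the second point is removed, because it lies 0.1 units from the line joining its neighbors. A tolerance larger than the height of the peak would reduce the polyline to its two end points, which illustrates how the user-defined tolerance controls the degree of generalization.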
3.9 Conclusion
Representation, or more broadly ontology, is a fundamental issue in GIS, since it underlies all of
our efforts to express useful information about the surface of the Earth in a digital computer. The
fact that there are so many ways of doing this makes GIS at once complex and interesting, a point
that will become much clearer on reading the technical chapter on data modeling, Chapter 8. But the
broader issues of representation, including the distinction between field and object
conceptualizations, underlie not only that chapter but many other issues as well, including
uncertainty (Chapter 6), and Chapters 14 through 16 on analysis and modeling.

Questions for further study

1. What fraction of the Earth’s surface have you experienced in your lifetime? Make diagrams like
that shown in Figure 3.1, at appropriate levels of detail, to show a) where you have lived in your
lifetime, b) how you spent last weekend. How would you describe what is missing from each of
these diagrams?
2. Table 3.3 summarized some of the arguments between raster and vector representations. Expand on
these arguments, providing examples, and add any others that would be relevant in a GIS application.
3. The early explorers had limited ways of communicating what they saw, but many were very
effective at it. Examine the published diaries, notebooks, or dispatches of one or two early
explorers and look at the methods they used to communicate with others. What words did they use to
describe unfamiliar landscapes and how did they mix words with sketches?
4. Identify the limits of your own neighborhood, and start making a list of the discrete objects
you are familiar with in the area. What features are hard to think of as discrete objects? For
example, how will you divide up the various roadways in the neighborhood into discrete objects –
where do they begin and end?

Further reading

Chrisman N.R. 2002 Exploring Geographic Information Systems (2nd edn). New York: Wiley.
McMaster R.B. and Shea K.S. 1992 Generalization in Digital Cartography. Washington, DC: Association of American Geographers.
National Research Council 1999 Distributed Geolibraries: Spatial Information Resources. Washington, DC: National Academy Press. Available: www.nap.edu.
