
Part I

___________________________________________________________________

Data for Decision-Making

© 2008 by Taylor & Francis Group, LLC


CHAPTER 2

An Optimized Semi-Automated Methodology for Populating a National Land-Use Dataset
W. Tompkinson, D. Morton, S. Gomm and E. Seaman
© Crown Copyright 2007. Reproduced by permission of Ordnance Survey

2.1 INTRODUCTION

Land use can be defined as the ‘activity or socio-economic function for which
land is used’1, p.2. Barnsley et al.2 note that the terms ‘land cover’ and ‘land use’ are
often used interchangeably within a single classification scheme and provide the
distinct definitions of land cover referring to the ‘physical materials on the surface
of a given parcel of land’ and land use being the ‘human activity that takes place
upon it’. Although the increased availability of high-resolution remotely-sensed
imagery offers the potential for the regular update of land cover data, land use
information cannot be derived solely from identification of land surface
characteristics. This is because land use is defined in terms of function rather than
physical form3, and consequently has a high dependency on in situ manual survey.
With the release of the Ordnance Survey (OS) MasterMap® Topography Layer4,
a national feature-based topographic dataset for Great Britain, there are
opportunities to integrate and associate a diverse range of data sources, and also to
infer the socio-economic function of topographic objects based upon their context
and relationships to other objects. This chapter outlines a methodology for
populating a land-use dataset based on the OS MasterMap® Topography Layer,
using object-based analysis techniques and information derived from existing
Ordnance Survey and third-party datasets. The aim of the research was to further
classify the land use of topographic objects in terms of their morphology and
spatial relationships using an object-based approach.

2.2 DATA

The research in this chapter is based upon Great Britain’s large-scale geographic
information. Responsible for Great Britain’s National Geographic Database,
Ordnance Survey maintains digital topographic datasets that are surveyed at the
basic scales of 1:1250 in urban areas, 1:2500 in rural areas and 1:10000 in
mountain and moorland environments. In 1991, Ordnance Survey released
Land-Line®, a digital vector topographic dataset (composed of a ‘spaghetti’ vector data
structure, i.e., only points and lines) produced at these scales5.
A successor to Land-Line®, the OS MasterMap® Topography Layer, was
introduced in 2001. This product comprises a seamless database of over 400
million objects that depict detailed vector topographic features such as houses, land
parcels and pavements6 and constitutes a major change in how Great Britain’s
topographic data are represented and manipulated within geographic information
systems and databases.
Feature-based datasets that are similar to OS MasterMap® have been produced
by several other national mapping agencies, especially within Europe. Examples
include the Vector25 digital landscape model of Switzerland7 and the Dutch
Top10vector product8. Differences between these products and the OS MasterMap®
Topography Layer that would have implications for the modelling processes
described here include:

• The scale at which the products are delivered: 1:25000 and 1:10000, for
the Swiss and Dutch products respectively.
• The associated simplification of their data models. For example, both the
Swiss and Dutch products contain a less detailed range of topographic
feature types relative to the OS MasterMap® Topography Layer.

It should be noted that the OS MasterMap® Topography Layer also differs from
the UK Land Cover Map 2000 developed by the Centre for Ecology and
Hydrology9. This is a parcel-based vector dataset derived through the processing of
satellite imagery that depicts land cover classes and was designed to provide a
census of broad habitat types in the United Kingdom10.
Additional Ordnance Survey data used in this work include ADDRESS-POINT®,
a point-based dataset that identifies the precise geographical location of
residential, business and public postal addresses11, and OSCAR®, a national large-
scale vector dataset depicting road information12.

2.3 PREVIOUS APPROACHES TO LAND-USE CLASSIFICATION

As described by Wyatt13, between the mid 1980s and early 1990s, the (then)
Department of the Environment commissioned a series of studies14,15 to assess the
feasibility of a national land-use stock survey (interests in land use information are
now being taken forward by Communities and Local Government16). The main
conclusion from these studies was that land use should be collected and maintained
in collaboration with Ordnance Survey’s large-scale digital mapping. Following the
further recommendation by Dunn and Harrison17 that a national land-use stock
system should be created, additional evaluation by the Ordnance Survey18 in 1996
concluded that a desk-based method that used surveyors’ local knowledge, together
with large-scale Ordnance Survey data, would be the most appropriate approach in
urban areas. For rural environments, stereo interpretation of color aerial
photography was the preferred method.
It has long been recognized that physically visiting land parcels to assess their
use is a most inefficient approach to developing and maintaining extensive land-use
maps, and that manual interpretation of high-resolution remotely-sensed images
offers a more effective and efficient procedure. Such an approach has the advantage
of utilizing human intelligence to understand the function of land parcels. The
operator intuitively studies the spatial context of a topographical feature within its
overall environment inferring, for example, that a building next to a garden is
probably residential, and one peripheral to a residential area and surrounded by a
car park might be retail. The obvious disadvantage to this approach is that it is very
time consuming when applied to anything beyond a local scale. Consequently,
there has been a history of research into increasingly automated methods of land-
use classification, both at Ordnance Survey and in the wider research community.
Given the difficulties in applying the intuitive rules of visual interpretation to
imagery in an automated manner, initial research at Ordnance Survey sought to
populate a land-use dataset using attributes drawn from a polygonized version of
Land-Line®, ADDRESS-POINT® and OSCAR®, in combination with intelligence
from third party sources19. The datasets that were proposed for use in future
incarnations of this methodology included: a remotely-sensed land cover data
source to populate land cover attributes, a range of commercial business directories
and the Valuation Office’s National Non-Domestic Rating List20 and Council Tax
Valuation List21 (these are databases that relate to local taxation and can be used to
aid differentiation between commercial and residential properties). The potential
for using remotely-sensed satellite data was also explored22. Remotely sensed
satellite data have the advantages over aerial photography of larger areal coverage
and higher revisit rates, but their contribution was limited at this time (1997) by the
non-availability of very high-resolution satellite imagery.
The data integration method proposed in the Ordnance Survey reports19,22,23 was
developed further through a research contract undertaken on behalf of the National
Land Use Database (NLUD®) Partnership24 that was placed by the then
Department for Transport, Local Government and the Regions (now Communities
and Local Government) with Infoterra Ltd. This study combined a limited number
of national coverage datasets, including those trialled in earlier Ordnance Survey
work19, to form a prior information dataset. The integration was performed using
rule-based searches and was known as the Semi-Automated Data Driven Analysis
(SADDA) technique1. This approach had two main elements, the first consisting of
data driven methods to directly classify polygons, and the second involving rules to
assign labels to unclassified polygons25. The latter inferences were based upon:

• Clump analysis: following the classification of a group of buildings using
data-driven methods, any remaining adjacent polygons within the same
‘clump’ were assigned a similar use.
• Block analysis: Ordnance Survey OSCAR data were used to define blocks
of one land use and all unclassified polygons in such blocks were assigned
that use.
• Residential adjacency: polygons with an OS MasterMap® Topography
Layer ‘Make’ attribute (identifying features as man-made, natural, etc.) of
‘Multiple’ were classified as residential if they were next to other
residential polygons.

It was recognized, however, that these were relatively primitive methods and
Infoterra25 concluded that there was scope for more sophisticated contextual
analysis that was beyond the remit of their study.
A fundamental weakness in only using a ‘data integration’ approach to land-use
classification is that features are treated in a very individualistic manner, i.e.,
polygons will be classified only if a point from another data source falls within
them. Curtilage (i.e., land surrounding a building26) or other buildings on a large
site such as a hospital complex may be left unclassified. As a result, a number of
approaches have been adopted in remote sensing research to mimic the rules used
by a manual photo-interpreter to infer land use based upon relationships between
objects in the landscape.
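The weakness described here can be illustrated with a minimal sketch of direct attribution, assuming polygons are given as vertex lists and each address point carries a land-use label. The function names, the ray-casting test and the sample coordinates are illustrative assumptions; they are not taken from SADDA or from any Ordnance Survey product.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside polygon (list of (x, y))."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal ray through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def attribute_directly(polygons, labelled_points):
    """Give each polygon the label of the first point that falls inside it."""
    result = {}
    for poly_id, vertices in polygons.items():
        result[poly_id] = None  # stays unclassified unless a point is found
        for point, label in labelled_points:
            if point_in_polygon(point, vertices):
                result[poly_id] = label
                break
    return result


# Hypothetical site: a building and the strip of curtilage next to it.
polygons = {
    "building": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "curtilage": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
labelled_points = [((5, 5), "Retailing")]

direct = attribute_directly(polygons, labelled_points)
```

In this made-up case only the building receives a label; the curtilage polygon stays unclassified because no point falls within it.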
At their simplest level, these image-based approaches have involved focusing
upon the immediate relationships between pixel values in an image. These ‘kernel-
based’ approaches involve studying pixel values within a convolution kernel and
investigating the spatial pattern of pixel values in terms of edge and vertex
adjacency events2. If pixel values are classified to represent land cover or material
types, the land use is then inferred from the spatial relationships between those
surface categories. However, capturing the spatial composition of high-resolution
imagery (< 5 m) using this method of analysis appears to be problematic. Large
kernels are required to better capture the spatial structure within such imagery, but
their application introduces the unwanted effects of blurring and smoothing and so
constrains the use of such techniques. A refinement, implemented upon medium-
resolution SPOT-HRV and Landsat-TM satellite data by Barnsley et al.2, is a
‘higher-level’ approach that takes better account of functional relationships
between pixels, combined with image texture models.
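As a rough illustration of the kernel-based idea, the sketch below counts adjacency events between classified pixels inside a square window. It assumes that an edge adjacency event is a pair of pixels sharing a side and that a vertex adjacency event is a pair touching only at a corner; this is a simplified reading, and the function name and land-cover labels are hypothetical rather than those of Barnsley et al.

```python
from collections import Counter


def adjacency_events(grid, row, col, size=3):
    """Count label pairs that share an edge, or only a vertex, in a kernel.

    grid is a list of rows of land-cover labels; the kernel is the
    size x size window whose top-left pixel is (row, col).
    """
    edge, vertex = Counter(), Counter()
    for r in range(row, row + size):
        for c in range(col, col + size):
            # Look right and down for edges, and to both lower diagonals for
            # vertices, so that every unordered pixel pair is visited once.
            for dr, dc, counter in ((0, 1, edge), (1, 0, edge),
                                    (1, 1, vertex), (1, -1, vertex)):
                r2, c2 = r + dr, c + dc
                if row <= r2 < row + size and col <= c2 < col + size:
                    pair = tuple(sorted((grid[r][c], grid[r2][c2])))
                    counter[pair] += 1
    return edge, vertex


# Hypothetical 3 x 3 window of classified pixels.
window = [
    ["built", "built", "grass"],
    ["built", "grass", "grass"],
    ["tree", "grass", "grass"],
]
edge_events, vertex_events = adjacency_events(window, 0, 0)
```

The two counters summarize the spatial pattern of land cover in the window; a land-use label would then be inferred from such counts by a separate classifier, which is not sketched here.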
Increasingly sophisticated approaches have been used to identify contiguous
objects in imagery and then assess the relationships between them (as opposed to
pixel-by-pixel analyses). Initial attempts to analyze the contextual relationships
between objects included the application of graph-based structural pattern
recognition techniques. For instance, Barnsley and Barr3 developed the eXtended
Relational Attribute Graph (XRAG) that analyzed structural properties and the
relationships between parcels in rasterized digital maps, with the objective that the
methodology could also be applied to parcels in high-resolution satellite imagery.
Further research at Ordnance Survey27,28 studied the relationships between land
use and the morphological characteristics of objects in a topographic database.
Rules were constructed that examined both the shape of objects and the patterns in
the immediate vicinity of the objects. A qualitative assessment of the results
indicated that when discriminating between uses such as residential, roads, fields of
crops and field margins, a 60-70% attribution could be achieved28. However, the
rule base used in this case was considered inappropriate for classifying all land uses
since it concentrated upon the geometrical properties of individual polygons, rather
than also considering the spatial pattern of those features around the entity of
interest. It was concluded from this study, as well as the experience of being
involved in research such as the CLEVER-mapping project19,29, that the integration
of raster and vector data and the attributes of third-party datasets all show potential
for classifying land use.
In recent years commercially available object-oriented image analysis software
such as eCognition30 has enabled imagery to be easily segmented and facilitated the
implementation of more complex classification rules based upon a range of
morphological and relational characteristics. This type of methodology has been
applied to land-use classification (e.g., Bauer and Steinnocher31). The main
difficulty in applying a segmentation-based technique in a mapping context is the
likelihood of incompatibilities between objects in the imagery and those in existing
topographic data. These limitations in image-led object-based methodologies
primarily result from the characteristics of image-dependent segmentation
algorithms. By their very nature, segmentation routines are usually based upon low-
level image analysis techniques designed to be an initial step in object
recognition32. As such, due to the inherent heterogeneity of remotely-sensed data,
factors such as differences in illumination (e.g., across a roof apex) will result in
segments that do not fully match the desired topographic features. Indeed, with
respect to applying segmentation routines for large-scale mapping purposes,
although there has been over two decades of research in this area, there is no
universal routine available that can extract the geometries of specific types of
topographic feature in a variety of contexts33. That said, methods for deriving
information from imagery for mapping purposes are increasing in their
sophistication, and in their use of additional knowledge (such as other spatial
information), to better target processing routines34. This is especially true if these
routines are applied to the revision of existing information. Nevertheless, despite
these advances, initial experiments that used the outlines of topographic features
in the OS MasterMap® Topography Layer to segment recently flown aerial
photography showed that too many factors affected the integrity of the resulting
land-use dataset. As a consequence, the methodology outlined below does not
depend upon remotely-sensed data.

2.4 ORDNANCE SURVEY CLASSIFICATION METHODOLOGY

The research at Ordnance Survey sought to produce an automated methodology
that would improve the quality of a land-use dataset initially populated through data
integration techniques by using morphological and relational rules to increase both
completeness and usability. To help avoid confusion between techniques, the latter
rule-based element has been given the acronym OOLUC (object-oriented land-use
classification), with the initial development utilizing OS MasterMap® Topography
Layer and ADDRESS-POINT® data35. OOLUC is essentially a fully-automated
rule base that applies rules in a cyclical manner, classifying objects into functional
groups, and then reclassifying objects, labelled with functional classifications, in
accordance with the NLUD classification scheme. For example, a school building
and a playground would be classified separately (with a playground being classified
using its land cover and proximity to a school), before both objects become
reclassified as ‘Education’.
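The two-stage idea in the school example can be sketched as follows. Only the ‘Education’ outcome comes from the text; the attribute names, the land-cover value, the 50 m distance threshold and the lookup table are hypothetical choices made for illustration.

```python
# Hypothetical lookup from functional groups to a final land-use class.
FUNCTIONAL_TO_FINAL = {
    "school building": "Education",
    "school playground": "Education",
}


def functional_group(polygon, max_distance_m=50.0):
    """First pass: label a polygon with a functional group, or None."""
    if polygon.get("address_type") == "school":
        return "school building"
    distance = polygon.get("distance_to_school_m", float("inf"))
    # A playground is recognized from its land cover and its proximity
    # to a school; both criteria here are assumed values.
    if polygon.get("land_cover") == "hard surface" and distance <= max_distance_m:
        return "school playground"
    return None


def final_class(polygon):
    """Second pass: map the functional group to the final land-use class."""
    return FUNCTIONAL_TO_FINAL.get(functional_group(polygon))


school = {"address_type": "school", "land_cover": "building"}
playground = {"land_cover": "hard surface", "distance_to_school_m": 12.0}
car_park = {"land_cover": "hard surface", "distance_to_school_m": 400.0}
```

Here the school and the playground receive different functional groups but the same final class, while the distant hard surface stays unclassified.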
Rule-based procedures require many assumptions. For instance, in Cassettari36
retail areas are assumed to take precedence over other types of town center land
use. In cases of low classification certainty, OOLUC incorporates similar rules that
assess the relative area of different classifications surrounding a polygon. For
example, if an unclassified polygon has a high proportion of properties labelled as
‘Retailing’ nearby, then it will be assigned a similar status. However, close
inspection of the results produced by OOLUC indicates that although these rules
lead to an increase in the completeness of the classification, there are high numbers
of misclassifications between classes such as ‘Retailing’, ‘Offices’ and ‘Industrial’.
In addition, the presence of some imprecisely defined land cover classes (for
instance water ditches in urban areas) results in misclassification by relational rules.
Consequently, when only using OOLUC there is a lower assurance that polygons
are classified correctly.
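A relative-area rule of this kind might look like the following sketch. The 50% share threshold, the function name and the sample areas are assumptions made for illustration; the actual OOLUC thresholds are not given in the text.

```python
from collections import defaultdict


def infer_from_neighbors(neighbors, min_share=0.5):
    """Return the land use covering the largest share of neighboring area.

    neighbors is a list of (land_use, area) pairs; unclassified neighbors
    carry land_use None. The label is returned only if its share of the
    total neighboring area reaches min_share; otherwise None is returned.
    """
    area_by_use = defaultdict(float)
    total = 0.0
    for land_use, area in neighbors:
        total += area
        if land_use is not None:
            area_by_use[land_use] += area
    if not area_by_use or total == 0:
        return None
    best_use, best_area = max(area_by_use.items(), key=lambda item: item[1])
    return best_use if best_area / total >= min_share else None


# Hypothetical surroundings of two unclassified polygons (areas in m²).
mostly_retail = [("Retailing", 400.0), ("Retailing", 250.0),
                 ("Offices", 200.0), (None, 150.0)]
mixed = [("Retailing", 300.0), ("Offices", 300.0), ("Industrial", 400.0)]
```

Lowering `min_share` would classify more polygons but, as the misclassifications between ‘Retailing’, ‘Offices’ and ‘Industrial’ suggest, at the cost of lower confidence.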
SADDA (see Section 2.3) was developed from a body of research that took
place at Ordnance Survey through the 1990s. The overall Ordnance Survey land-
use classification methodology (OSLUM) consists of applying a SADDA-like
technique followed by OOLUC. Such an approach acknowledges that an expanded
range of third-party datasets is required to differentiate between classes that
otherwise have a low confidence of classification, and that an optimal methodology
should consist of both data integration techniques (e.g., those in SADDA) and a
rule base (e.g., OOLUC). It should be noted that to avoid any inconsistencies that
might arise after applying SADDA’s inferred methods, only the data-driven part of
SADDA (which employs the integration techniques) is used within OSLUM.
Figure 2.1 illustrates the relationship between the SADDA and OOLUC
components of the OSLUM methodology. Within the OSLUM flowline, OOLUC
takes, as input data, OS MasterMap® Topography Layer with attributes for land
cover and land use that have been assigned during the SADDA procedure. Both
SADDA and OOLUC have been designed to populate the OS MasterMap®
Topography Layer in accordance with the NLUD classification scheme from
version 4.1 onwards1. In these versions of the NLUD classification, a polygon
should be assigned both a land use and land cover attribute. However, OOLUC is
only designed to provide land use attributes and so it is assumed that SADDA, on
its own, is sufficient to generate land cover information. Furthermore, polygons
with a land use attribute from SADDA do not have this information changed by
OOLUC, since priority is given to the direct, rather than inferred, method.

Figure 2.1 The structure of the Ordnance Survey land-use classification methodology (OSLUM).

Within the OOLUC component, polygons that have not been classified using
SADDA are populated according to the following sequence of rules:

1. Direct attribution.
2. Adjacent to directly-attributed object and fulfills certain morphological
criteria.
3. Within a defined distance of a directly-attributed object and fulfills certain
morphological criteria.
4. Fulfills defined morphological criteria and adjacent to another object of
defined morphological criteria.
5. Fulfills defined morphological criteria and within a defined distance of
another object of defined morphological criteria.
6. Fulfills defined morphological criteria.

Empirical investigation during development of the methodology indicated that,
to ensure stability in the resultant classification, the rule base should
be applied to each polygon four times. The order of precedence during each of the
cycles is recorded as an extra attribute for each polygon. Using this information, the
operator is able to obtain a confidence measure for the classification of each
polygon. After the program has finished applying the rule-base cycles, a further
rule is implemented that deals with curtilage and any remaining unclassified
polygons to ensure completeness in the final dataset. Example land use maps that
result from implementing just SADDA and the complete OSLUM methodology are
shown in Figures 2.2 and 2.3 respectively. The illustrations also demonstrate that
the OSLUM method provides greater completeness in the resulting dataset.
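The cyclical application of an ordered rule base, with the rule position recorded for each polygon, can be sketched as below. This is a minimal illustration under stated assumptions: the data structure, the single relational rule and the choice to make labels visible only in the following cycle are hypothetical, and the final curtilage rule is omitted.

```python
def run_rule_base(polygons, rules, cycles=4):
    """Apply an ordered rule base repeatedly to unclassified polygons.

    polygons maps id -> dict with at least a "land_use" key (None if unknown).
    rules is a list of functions (polygon_id, polygons) -> label or None,
    ordered from most to least trusted. For each newly classified polygon
    the cycle and the 1-based rule position are stored under "provenance".
    """
    for cycle in range(1, cycles + 1):
        updates = {}
        for poly_id, poly in polygons.items():
            if poly["land_use"] is not None:
                continue  # existing labels are never overwritten
            for position, rule in enumerate(rules, start=1):
                label = rule(poly_id, polygons)
                if label is not None:
                    updates[poly_id] = (label, (cycle, position))
                    break
        # Assumption: labels from one cycle only become visible in the next.
        for poly_id, (label, provenance) in updates.items():
            polygons[poly_id]["land_use"] = label
            polygons[poly_id]["provenance"] = provenance
    return polygons


def adjacent_to_classified(poly_id, polygons):
    """Hypothetical relational rule: copy the label of a classified neighbor."""
    for other in polygons[poly_id]["neighbors"]:
        if polygons[other]["land_use"] is not None:
            return polygons[other]["land_use"]
    return None


# Hypothetical chain of polygons: only "a" was labelled by the data-driven step.
site = {
    "a": {"land_use": "Education", "neighbors": ["b"]},
    "b": {"land_use": None, "neighbors": ["a", "c"]},
    "c": {"land_use": None, "neighbors": ["b", "d"]},
    "d": {"land_use": None, "neighbors": ["c"]},
}
result = run_rule_base(site, [adjacent_to_classified])
```

A polygon labelled in a later cycle, or by a rule further down the list, rests on a longer chain of inference, so the stored provenance could be read as a simple confidence indicator.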
To summarize, Table 2.1 compares the characteristics of OSLUM with those of
the alternative methods for land-use classification described in Section 2.3. This
evaluation suggests that OSLUM incorporates many favorable characteristics of
previously documented techniques (e.g., an object-based focus and no requirement
for direct use of imagery). Overall, the key advantages of OSLUM are that it:

• Preserves the high confidence of correct classification of individual
polygons (if suitable data are available) derived from a data integration
(data-driven) component, and
• Increases the value of the final dataset by populating objects that do not
have other data associated with them through the use of relational and
morphological modelling.

2.5 ACCURACY ASSESSMENT OF SADDA AND OSLUM

A quantitative assessment was conducted to compare the SADDA and OSLUM
methodologies described in the previous section. Reference data were collected for
nine study sites (each between 1 and 2 km² in area) in both urban and rural
environments. These data were used to validate results from SADDA by
Infoterra25. Analysis of the OSLUM results was performed on the same sites.

Figure 2.2 A land-use classification of central Sheffield using the SADDA methodology. Polygons
depicted in white are unclassified. Ordnance Survey data © Crown Copyright. All rights reserved.

Figure 2.3 A land-use classification of central Sheffield using the OSLUM methodology. Polygons
depicted in white are unclassified. Ordnance Survey data © Crown Copyright. All rights reserved.

Table 2.1 A comparison of methodologies for land-use classification

| Land-Use Classification Methodology | Input Data | Pixel or Object-Based | Rules to Take Account of Context |
|---|---|---|---|
| Manual interpretation | Remotely-sensed imagery | Intuitively object-based | Intuitive assumptions |
| Data integration | Feature-based vector data, raster land cover data, and a range of point-based spatial datasets | Object-based | None |
| Kernel-based | Remotely-sensed imagery | Pixel-based | Vertex and edge adjacency events |
| Graph-based | Rasterized digital map imagery | Object-based | Distance and direction relationships |
| Early morphological techniques | Polygonized vector data | Object-based | Immediate adjacency counts |
| Image-based object-oriented methodologies | Remotely-sensed imagery | Object-based | Morphological and spatial relationship rules between objects derived solely from imagery |
| Ordnance Survey land-use classification methodology (OSLUM) | Feature-based vector data, raster land cover data, and a range of point-based spatial datasets | Object-based | Morphological and spatial relationship rules between objects derived from object-based data. Non-visual prior information from point-based data |

On a per-class basis, Table 2.2 (urban sites) and Table 2.3 (rural sites)
summarize accuracy values for SADDA (first using only direct attribution, then
with the inferred methods added) and for OSLUM. The
accuracy values listed represent percentages of total areas. Based upon the original
SADDA results from Infoterra25, there are two types of accuracy value provided:

• Confidence (also known as user’s accuracy): The probability that an area
classified with a given class actually represents that class on the ground.
• Class Completeness (also known as producer’s accuracy): The proportion
of total area in each class of reference data that is classified correctly.
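The two measures defined above can be computed from area totals as in the following sketch. The record layout, the function name and the sample areas are assumptions made for illustration; they are not the procedure used by Infoterra or Ordnance Survey.

```python
from collections import defaultdict


def per_class_accuracy(records):
    """Return (confidence, completeness) dicts of area percentages per class.

    records is an iterable of (classified, reference, area) triples;
    classified is None where the method left the area unclassified.
    """
    correct = defaultdict(float)
    classified_area = defaultdict(float)
    reference_area = defaultdict(float)
    for classified, reference, area in records:
        reference_area[reference] += area
        if classified is not None:
            classified_area[classified] += area
            if classified == reference:
                correct[classified] += area

    def percent(part, whole):
        # None mirrors a blank table cell: there is nothing to divide by.
        return 100.0 * part / whole if whole else None

    classes = set(classified_area) | set(reference_area)
    confidence = {c: percent(correct[c], classified_area[c]) for c in classes}
    completeness = {c: percent(correct[c], reference_area[c]) for c in classes}
    return confidence, completeness


# Hypothetical areas in hectares.
records = [
    ("Residential", "Residential", 80.0),
    ("Residential", "Retailing", 20.0),
    ("Retailing", "Retailing", 30.0),
    (None, "Residential", 120.0),
    (None, "Retailing", 50.0),
]
confidence, completeness = per_class_accuracy(records)
```

In this made-up case Retailing has a Confidence of 100% but a Class Completeness of only 30%: everything labelled as Retailing is correct, yet most of the Retailing area in the reference data was missed.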

Table 2.2 Total per-class accuracy figures for urban reference sites

Columns (left to right): NLUD Classification (Version 4.1); Confidence (% Area) for
SADDA (data-driven only), SADDA (including inference techniques) and OSLUM;
Class Completeness (% Area) for SADDA (data-driven only), SADDA (including
inference techniques) and OSLUM.
1.1 Agriculture 52.78 52.77 52.78 91.21 91.21 91.21
1.2 Agric. bldg. 0.00 8.94 0.00 0.00 18.92
2.1 Forest/wood 12.45 12.45 12.45 29.33 29.33 29.33
2.2 Open land 44.51 0.00 28.87 14.79 0.00 28.28
2.3 Water 88.71 88.71 88.71 99.27 99.27 99.27
3.1 Mineral/quarry
3.2 Landfill site
4.1.1 Recreation 75.69 75.14 74.69 43.40 44.44 43.39
4.1.2 Allotment 100.00 100.00 100.00 85.60 85.60 85.60
4.2 Leisure bldg. 75.28 74.28 76.66 33.50 33.70 37.48
5.1.1 Highway 89.77 88.77 88.77 80.10 80.11 80.11
5.1.2 Car park 28.17 28.17 28.42 7.68 7.68 7.93
5.2.1 Railway 88.69 88.69 88.69 94.11 94.11 94.11
5.2.2 Airport
5.2.3 Dock 0.00 0.00
5.3 Utilities 0.00 14.27 15.84 18.75 16.68 20.89
6.1 Residential 90.44 95.38 86.84 39.30 86.96 86.30
6.2 Institutional 7.18 29.06 7.18 1.18 0.00 1.18
7.1.1 Inst. bldg. 85.35 83.92 81.42 23.67 18.95 27.35
7.1.2 Educat. bldg. 100.00 100.00 100.00 23.88 8.50 23.88
7.1.3 Relig. bldg. 100.00 100.00 100.00 4.26 4.26 7.53
8.1 Industry 31.78 27.69 28.03 10.70 11.71 16.49
8.2 Offices 48.56 48.63 40.12 30.47 31.22 31.36
8.3 Retailing 64.31 83.41 43.15 43.72 28.98 68.49
8.4 Warehousing 25.84 26.40 14.11 15.77 16.11 22.68
9.1.1 Vacant land 0.00 0.00 0.00 0.00
9.1.2 Vacant bldg. 0.00 0.00 0.00 0.00
9.2 Derelict land 0.00 0.00 0.00
10.1 Defense

The land-use classification scheme employed in the assessment was the NLUD
Classification v4.1 (see Harrison1 for further information). In Tables 2.2 and 2.3, if
no accuracy value is shown then there was no population of the class in question,
while if a value of 0.00 is listed then that class was populated with entirely
incorrect classifications.

Table 2.3 Total per-class accuracy figures for rural reference sites

Columns (left to right): NLUD Classification (Version 4.1); Confidence (% Area) for
SADDA (data-driven only), SADDA (including inference techniques) and OSLUM;
Class Completeness (% Area) for SADDA (data-driven only), SADDA (including
inference techniques) and OSLUM.
1.1 Agriculture 98.13 98.14 98.14 92.95 92.95 92.95
1.2 Agric. bldg. 82.50 82.50 74.96 0.18 0.18 56.25
2.1 Forest/wood 64.81 64.81 64.81 86.91 86.94 86.92
2.2 Open land 77.46 77.46 77.40 86.79 86.79 86.79
2.3 Water 20.10 20.11 20.11 94.38 94.38 94.38
3.1 Mineral/quarry 0.00 0.00 0.00
3.2 Landfill site
4.1.1 Recreation 0.00 0.00 0.00
4.1.2 Allotment
4.2 Leisure bldg. 30.03 22.96 20.51 97.60 97.60 97.60
5.1.1 Highway 79.80 79.80 79.80 84.38 84.38 84.38
5.1.2 Car park 0.00 0.00 0.00
5.2.1 Railway
5.2.2 Airport
5.2.3 Dock
5.3 Utilities 68.45 68.45 93.55 0.02 0.02 0.15
6.1 Residential 63.13 89.72 84.01 4.13 38.91 52.46
6.2 Institutional 0.00 0.00 0.00 0.00 0.00 0.00
7.1.1 Inst. bldg. 61.73 54.50 73.19 16.05 16.05 57.38
7.1.2 Educat. bldg. 100.00 100.00 100.00 6.55 6.61 6.61
7.1.3 Relig. bldg. 100.00 100.00 100.00 8.51 8.59 9.24
8.1 Industry 100.00 100.00 100.00 1.55 1.55 2.93
8.2 Offices 0.00 0.00 0.00
8.3 Retailing 59.28 44.99 20.38 12.62 13.29 56.19
8.4 Warehousing 0.00
9.1.1 Vacant land 0.00 0.00 0.00 0.00
9.1.2 Vacant bldg. 0.00 0.00 0.00
9.2 Derelict land 0.00
10.1 Defense

The results for urban areas in Table 2.2 indicate that including inferred methods
improved the Class Completeness for several NLUD categories characterized by
relatively large ‘blocks’ with multiple buildings and surrounding land. For
example, using both the ‘full’ SADDA approach and the OSLUM methodology
dramatically increased the Class Completeness of the 6.1 Residential category
when compared to the solely data-driven method. This trend is further
demonstrated in the case of OSLUM with higher completeness values for 7.1.1
Institutional Building, 7.1.2 Educational Building and 7.1.3 Religious Building,
along with commercial classes such as 8.3 Retailing and 8.4 Storage and
Warehousing. For classes 8.1 Industry, 8.3 Retailing and 8.4 Storage and
Warehousing, the OSLUM confidence value is lower than those produced by both
SADDA methods. This is explained by OSLUM using more inferred rules than
even the full SADDA method to associate topographic objects with others in
order to assign a classification in situations where associated data are sparse.
In the rural environment, Table 2.3 shows that the Class Completeness for 6.1
Residential and 8.3 Retailing was improved through the association of polygons
with surrounding land uses that had already been identified. A particular increase in
Class Completeness is also apparent for 1.2 Agricultural Buildings using OSLUM.
The difference between the full SADDA and OSLUM results for this category
reflects a better representation of the context within which such buildings are set in
the rules of the latter. In OSLUM, an agricultural building is differentiated from a
residential one by stipulating that it should not be adjacent to a garden and that it
should be surrounded by a high proportion of natural land cover.
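The two conditions stated for agricultural buildings can be expressed as a small predicate. The 70% threshold for a ‘high proportion’, the attribute names and the sample neighbors are assumptions made for illustration, since the text does not give the values used in OSLUM.

```python
def is_agricultural_building(neighbors, min_natural_share=0.7):
    """Apply the two stated conditions to the polygons around a building.

    neighbors is a list of dicts with the keys "is_garden", "is_natural"
    and "area"; min_natural_share is an assumed threshold.
    """
    # Condition 1: the building must not be adjacent to a garden.
    if any(n["is_garden"] for n in neighbors):
        return False
    total = sum(n["area"] for n in neighbors)
    if total == 0:
        return False
    # Condition 2: most of the surrounding area must be natural land cover.
    natural = sum(n["area"] for n in neighbors if n["is_natural"])
    return natural / total >= min_natural_share


# Hypothetical surroundings (areas in m²).
barn_neighbors = [
    {"is_garden": False, "is_natural": True, "area": 900.0},
    {"is_garden": False, "is_natural": False, "area": 100.0},
]
house_neighbors = [
    {"is_garden": True, "is_natural": False, "area": 200.0},
    {"is_garden": False, "is_natural": True, "area": 800.0},
]
```

The second case fails on the garden test alone, even though most of its surroundings are natural land cover.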
The rules that contribute to increased Class Completeness within the rural
setting also tend to produce a slight fall in the corresponding Confidence levels
(e.g., see the results for 8.3 Retailing). One exception is the 5.3 Utilities class
where the Confidence level actually increases when using OSLUM. This could be
due to improved contextual rules (e.g., land owned by a water company being close
to a reservoir). However, mistakes in SADDA, such as classifying reservoirs into
category 2.3 Water (derived directly from the land-cover attribute) and not 5.3
Utilities, are also propagated into OSLUM. This helps to explain the extremely low
values of Class Completeness for the 5.3 Utilities category in all methodologies. In
addition, it suggests that a future version of the rule base may need to include an
extra condition that classifies 2.3 Water as 5.3 Utilities if it is adjacent to a 5.3
Utilities object in a previous classification cycle.
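The extra condition suggested here could take roughly the following form. This is only a sketch of a possible future rule: the data structure and function name are hypothetical, and labels are frozen at the start of the pass to represent the state left by the previous classification cycle.

```python
def reclassify_water_next_to_utilities(polygons):
    """Relabel 2.3 Water as 5.3 Utilities when it touched Utilities last cycle.

    polygons maps id -> {"land_use": str or None, "neighbors": [ids]}.
    """
    # Freeze the labels from the previous cycle so that changes made here
    # do not cascade within the same pass.
    previous = {pid: poly["land_use"] for pid, poly in polygons.items()}
    for pid, poly in polygons.items():
        if previous[pid] != "2.3 Water":
            continue
        if any(previous[n] == "5.3 Utilities" for n in poly["neighbors"]):
            poly["land_use"] = "5.3 Utilities"
    return polygons


# Hypothetical polygons: a reservoir beside a water works, and a farm pond.
site = {
    "reservoir": {"land_use": "2.3 Water", "neighbors": ["works"]},
    "works": {"land_use": "5.3 Utilities", "neighbors": ["reservoir"]},
    "pond": {"land_use": "2.3 Water", "neighbors": ["field"]},
    "field": {"land_use": "1.1 Agriculture", "neighbors": ["pond"]},
}
reclassify_water_next_to_utilities(site)
```

Because such a rule overrides a label that came from the data-driven step, it would be an exception to the priority normally given to direct attribution.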
Other features of the results generated by OSLUM are as follows:

• A vacant building in a rural environment was usually put into class 1.2
Agricultural Building by OSLUM. In an urban area it was usually
classified as 8.1 Industry or 8.4 Storage and Warehousing.
• If an 8.1 Industry object was incorrectly classified, it was usually defined
as 8.4 Storage and Warehousing.
• Car parks in reference data that are associated with sites such as retail
parks were sometimes classified as 5.1.2 Car Park and not as 8.3 Retailing.

Table 2.4 presents the overall accuracies of the methods tested in this study.
The higher accuracy values in both urban and rural environments indicate that
applying the OOLUC rule base after the data-driven component of SADDA
adds value to a method that only directly populates polygons with
existing data. In addition, the results illustrate the effect of the more sophisticated
contextual inference rules in OSLUM, compared to those in the full version of
SADDA. For example, OSLUM leaves fewer polygons unclassified and the better
accounting for curtilage contributes to the higher overall accuracy of the method in
urban areas. The very small differences between the results in rural areas can be
attributed to the presence of a higher proportion of classes that can be more reliably
populated using existing land-cover classifications (e.g., 1.1 Agriculture).

Table 2.4 Overall accuracies of the SADDA and OSLUM classification methodologies

         SADDA (data-driven    SADDA (including
         approach only)        inference techniques)    OSLUM
Urban    46.87%                53.25%                   59.87%
Rural    87.55%                88.04%                   88.34%
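
The size of each step in Table 2.4 can be checked with a few lines of arithmetic. The sketch below simply subtracts adjacent columns of the table; the dictionary layout and the helper name are assumptions made for the illustration, and the gains are expressed in percentage points.

```python
# Overall accuracies (%) copied from Table 2.4, in column order.
overall_accuracy = {
    "Urban": {
        "SADDA (data-driven only)": 46.87,
        "SADDA (with inference)": 53.25,
        "OSLUM": 59.87,
    },
    "Rural": {
        "SADDA (data-driven only)": 87.55,
        "SADDA (with inference)": 88.04,
        "OSLUM": 88.34,
    },
}


def stepwise_gains(row):
    """Percentage-point gain of each method over the one in the column before it."""
    values = list(row.values())
    return [round(later - earlier, 2) for earlier, later in zip(values, values[1:])]


urban_gains = stepwise_gains(overall_accuracy["Urban"])
rural_gains = stepwise_gains(overall_accuracy["Rural"])

for environment, gains in (("Urban", urban_gains), ("Rural", rural_gains)):
    print(environment, gains)
```

In urban areas each step adds more than six percentage points (6.38, then 6.62), whereas in rural areas the steps are 0.49 and 0.30, which is consistent with the text's observation that the rural differences are very small.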

2.6 CONCLUSIONS

This chapter has outlined and evaluated a semi-automatic approach to
populating an initial baseline land-use dataset. The aim was to advance a
methodology that couples techniques for deriving land-use function from
morphological and spatial rules with information from existing geographic datasets,
using solely semi-automated techniques. OSLUM fulfills these criteria and
enhances previous approaches to classifying land use in a semi-automated manner.
The results from evaluating OSLUM suggest that, for many land-use classes, it
provides higher levels of per-class completeness and of total population
(especially within urban areas) than the SADDA method on its own.
The main advantage of OSLUM compared to other methods is that the functional
characteristics of land use are assessed in terms of both visual context (such as in
the pattern of topographical objects on a map) and non-visual information (such as
address, or business-directory related information). Since these contextual rules are
only applied to polygons that are unclassified after applying a direct data-driven
approach, higher classification accuracies are to be expected. In addition, unlike the
inference rules employed in the full version of SADDA, the contextual rules in
OSLUM are applied in a fully automatic manner in all types of environment.
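
The ordering described above, in which contextual rules are consulted only for polygons that the data-driven step leaves unclassified, can be sketched as a simple two-stage function. This is an illustrative outline under stated assumptions, not the OSLUM implementation: the attribute names (`class_from_existing_data`, `form`, `setting`) and both rule functions are hypothetical stand-ins.

```python
from typing import Callable, Dict, Optional

Polygon = Dict[str, str]
Rule = Callable[[Polygon], Optional[str]]


def data_driven_rule(polygon: Polygon) -> Optional[str]:
    """Hypothetical stand-in: copy a class supplied by an existing dataset."""
    return polygon.get("class_from_existing_data")


def contextual_rule(polygon: Polygon) -> Optional[str]:
    """Hypothetical stand-in for a single contextual inference rule."""
    if polygon.get("form") == "building" and polygon.get("setting") == "rural":
        return "1.2 Agricultural Building"
    return None


def classify(
    polygons: Dict[str, Polygon], direct: Rule, contextual: Rule
) -> Dict[str, Optional[str]]:
    """Apply the direct rule first; fall back to context only where it fails."""
    labels: Dict[str, Optional[str]] = {}
    for polygon_id, polygon in polygons.items():
        label = direct(polygon)
        if label is None:
            # The contextual rule never overwrites a data-driven label.
            label = contextual(polygon)
        labels[polygon_id] = label
    return labels


polygons = {
    "a": {"form": "building", "setting": "rural",
          "class_from_existing_data": "8.3 Retailing"},
    "b": {"form": "building", "setting": "rural"},
    "c": {"form": "land", "setting": "urban"},
}
labels = classify(polygons, data_driven_rule, contextual_rule)
```

Polygon "a" keeps its data-driven label even though the contextual rule would also match it, "b" is filled in by context, and "c" remains unclassified.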
OSLUM’s shortcomings include the expense and availability of data required in
the data-driven component and its dependence upon the detail that is present in
large-scale topographic data for morphological modelling. Although, in theory, a
similar methodology could be applied to other countries that possess a national
feature-based topographic dataset, the use of a generalized base dataset would
probably have a detrimental effect upon the accuracy and completeness of the
results. For some users, the Class Completeness and Confidence values of some
individual classes in OSLUM might be too low, which is a further
disadvantage of the methodology. However, these values could be improved with
subsequent manual interpretation and intervention. The value of OSLUM lies in
maximizing the accuracy of a land-use dataset that has been populated using
automated methods, thereby minimizing cost in any subsequent intervention or
maintenance processes.
The analysis of results produced by OSLUM indicates that the method provides
improvements in the overall accuracy and completeness of a land-use dataset
produced using close to fully automatic methods, most notably within the urban
environment. OSLUM therefore offers one way forward to solve the inherently
multi-faceted problem of effective and complete population of a land-use dataset.

2.7 ACKNOWLEDGMENTS

This chapter is © Crown Copyright 2007. Reproduced by permission of
Ordnance Survey.
The authors are grateful to John Goodwin for his work on the initial data
integration and morphological methodologies developed at Ordnance Survey, and
to Tristan Wright, Nicholas Regnauld and David Russell for their assistance in the
development and analysis of OOLUC.
This chapter has been prepared for information purposes only. It is not designed
to constitute definitive advice on the topics covered and any reliance placed on the
contents of this article is at the sole risk of the reader.

2.8 REFERENCES

1. Harrison, A., Extending the dimensionality of OS MasterMap™: land use and land cover, Presented at
the AGI Conference at GIS 2002, Available at
http://www.nlud.org.uk/draft_one/key_docs/pdf/Harrison_AGI02.pdf, 2002.
2. Barnsley, M.J., Moller-Jensen, L., and Barr, S.L., Inferring urban land-use by spatial and structural
pattern recognition, in Remote Sensing and Urban Analysis, Donnay, J-P., Barnsley, M.J., and Longley,
P.A., Eds., Taylor and Francis, London, 2000, 115-144.
3. Barnsley, M.J. and Barr, S.L., Distinguishing urban land-use categories in fine spatial resolution
land-cover data using a graph-based, structural pattern recognition system, Computers, Environment and
Urban Systems, 21, 209-225, 1997.
4. Ordnance Survey, Welcome to OS MasterMap,
http://www.ordnancesurvey.co.uk/oswebsite/products/osmastermap, 2006.
5. Ordnance Survey, Land-Line® User Guide, Ordnance Survey, Southampton, 2002.
6. Murray, K., A new geo-information framework for Great Britain, in Proceedings of FIG XXII
International Congress, Washington DC, 2002, 1/13.
7. Swisstopo, Vector25, http://www.swisstopo.ch/en/products/digital/landscape/vec25/, 2006.
8. Kadaster, Top10vector,
http://www.kadaster.nl/zakelijk/producten/topografische_dienst_top10vector.html, 2003.
9. Centre for Ecology and Hydrology, Land Cover Map,
http://www.ceh.ac.uk/sections/seo/lcm2000_home.html, 2007.
10. Fuller, R.M., Smith, G.M., Sanderson, J.M., Hill, R.A., and Thomson, A.G., The UK Land Cover
Map 2000: Construction of a parcel-based vector map from satellite images, Cartographic Journal, 39,
15-25, 2002.
11. Ordnance Survey, ADDRESS-POINT®,
http://www.ordnancesurvey.co.uk/oswebsite/products/addresspoint, 2006.
12. Ordnance Survey, OSCAR® User Guide, Ordnance Survey, Southampton, 2005.
13. Wyatt, P., Creation of an Urban Land Use Database, RICS Education Trust, Bristol, 2002.
14. Roger Tym and Partners, National Land Use Stock Survey: A Feasibility Study for the Department of
the Environment, Roger Tym and Partners, London, 1985.
15. Dunn, R. and Harrison, A., A feasibility study for a national land use stock survey, in Proceedings of
the AGI Conference 1992, Association for Geographic Information, London, 1992, 1.13.1-1.13.4.
16. Communities and Local Government, Planning, building and the environment,
http://www.communities.gov.uk/index.asp?id=1503250, 2007.
17. Dunn, R. and Harrison, A., Working towards a national land use stock system, in Proceedings of the
AGI Conference 1994, Association for Geographic Information, London, 1994, 8.1.1-8.1.5.
18. Ordnance Survey, Research Trial for a National Land Use Stock System, Report to the Department of
the Environment, Ordnance Survey, Southampton, 1996.
19. Ordnance Survey, Population & Maintenance of a Land Use/Cover Database of England (Stage 1),
Internal Report, Ordnance Survey, Southampton, 1997.
20. Valuation Office Agency, Business Rates - general information,
http://www.voa.gov.uk/business_rates/index.htm, 2005.
21. Valuation Office Agency, Council Tax Valuation List – glossary,
http://www.voa.gov.uk/cti/MGlossary.asp, 2005.
22. Ordnance Survey, Population & Maintenance of a Land Use/Cover Database of England (Stage 2),
Internal Report, Ordnance Survey, Southampton, 1997.
23. Ordnance Survey, Population & Maintenance of a Land Use/Cover Database of England (Stage 3),
Internal Report, Ordnance Survey, Southampton, 1997.
24. National Land Use Database, http://www.nlud.org.uk/, 2006.
25. Infoterra, Final Report for Research Contract CP01004: NLUD County Demonstrator, Infoterra
Limited, Farnborough, 2002.
26. ACSM, Glossary of Mapping Sciences, American Congress on Surveying and Mapping, American
Society for Photogrammetry and Remote Sensing, and American Society of Civil Engineers, Maryland
and New York, 1994, 133.
27. Ordnance Survey, Land Use/Cover Class Signatures, Internal Report, Ordnance Survey,
Southampton, 1998.
28. Ordnance Survey, NLUD by Integrated Analysis of Datasets, Internal Report, Ordnance Survey,
Southampton, 1999.
29. Smith, G.M., Fuller, R.M., Amable, G., Costa, C., and Devereux, B.J., CLEVER-mapping: An
implementation of a per-parcel classification procedure within an integrated GIS environment,
Proceedings of the Remote Sensing Society conference, Observations and Interactions: RSS97, Remote
Sensing Society, Nottingham, 1997, 21-26.
30. Baatz, M., Heynen, M., Hofmann, I., Lingenfelder, M., Mimler, A., Schape, M., Weber, M., and
Willhauck, G., eCognition User Guide, Definiens AG, Munich, 2000.
31. Bauer, T. and Steinnocher, K., Per-parcel land-use classification in urban areas applying a rule-based
technique, GIS, 6, 25-27, 2001.
32. Pal, N.R. and Pal, S.K., A review on image segmentation techniques, Pattern Recognition, 26,
1277-1294, 1993.
33. Agouris, P., Mountrakis, G., and Stefanidis, A., Automated spatiotemporal change detection in digital
aerial imagery, in Proceedings of Aerosense 2000, SPIE Proceedings Vol. 4054, Orlando, Florida, 2000,
2-12.
34. Baltsavias, E. P., Object extraction and revision by image analysis using existing geodata and
knowledge: current status and steps towards operational systems, in Proceedings of ISPRS Commission
II Symposium, Xian, China, 2002, 13.
35. Ordnance Survey, Land Use Classification of OS MasterMap, Internal Report, Ordnance Survey,
Southampton, 2002.
36. Cassettari, S., Land use mapping – the GeoInformation Group’s approach, Geomatics World, 6,
40-41, 2002.
