Forlani 2012
All content following this page was uploaded by Gianfranco Forlani on 19 November 2014.
Commission III, WG 3
ABSTRACT
Filtering non-terrain points from raw laser scanning data is the most important step towards improving productivity in DTM generation.
Filtering algorithms are built on assumptions about what discriminates terrain points from points on other objects (e.g. buildings and
vegetation). In most cases, a single measure is used to accept or reject points. In this paper a three-stage raw data classification
algorithm is presented. After a preliminary interpolation to a grid, a region growing based on height differences is applied. Segments
from the region growing are classified as terrain, building or vegetation, based on their geometric and topological description.
Terrain grid cells are conditionally low-pass filtered, to remove low vegetation. A piece-wise approximation of the terrain surface is
computed, built from the grid cells classified as terrain. Finally, raw data are accepted as terrain within a given distance from the
surface. Results obtained on an ISPRS filter test data set are shown to illustrate the effectiveness of the procedure.
1
Currently visiting professor at Delft University of Technology, Dept. of Earth Observation and Space Systems
IAPRS Volume XXXVI, Part 3 / W52, 2007
segments based on consistency of normal vectors, distance to the fitting plane and distance from seed point; robust filtering of the surface is then applied, where the same weights are applied to groups of points, rather than to single points.
In this paper a strategy for the classification and filtering of raw laser scanning data is presented. The main building blocks of the strategy (namely, data segmentation by region growing and region classification) have already been presented in (Nardinocchi and Forlani, 2001) and (Nardinocchi et al., 2003), respectively. In (Forlani et al., 2006) the capability of the method in building detection was demonstrated, in the context of building reconstruction from laser data. In the following, Section 3 presents the main features of the strategy; Section 4 reviews the segmentation and the classification, pointing out the changes introduced with respect to earlier versions and the resulting improvements. Section 5 presents the raw data filtering, which was only sketched in the previous papers. Finally, Section 6 reports on the results. Examples and results refer to Site 5 of the ISPRS laser scanning test dataset (Sithole and Vosselman, 2003).

3. OUTLINE OF THE METHOD

The classification strategy comprises three stages (see Figure 1). In the first one, raw data are interpolated to a grid, taking the lowest elevation in the cell as grid value.
In the second stage, grid data are segmented by a region growing algorithm with adaptive threshold. The geometric characteristics and the topological relationships among the segments are reconstructed and, based on a set of rules, the segments are classified as outliers, vegetation, building or terrain. Although each cell is assigned to a class, the raw data it contains must still be classified individually.
In the third and last stage of the procedure, the whole set of raw data is examined. For the former, consistency is measured with respect to the elevations of the neighbouring terrain cells. For the latter, a piecewise approximation of the terrain with a continuous surface is estimated using data from cells classified as terrain; consistency is measured by thresholding the distance from the surface.

[Figure 1 diagram labels: Raw Lidar Data; Adaptive region growing segmentation; Slope-based Segmentation; Gradient Orientation Segmentation; Geometrical and topological relationships; Knowledge base; Point-based filtering of high vegetation; Point-based filtering of low vegetation and noise]
Figure 1. Components and main relationships of the framework for LIDAR data filtering. Solid lines refer to processing of grid data; dashed lines to processing of raw data

Aggregation of raw data in segments enables a richer description of geometric properties and the establishment of topologic relationships. This makes it possible to reason about their relationships and provides the contextual information essential to increase the probability of correct classification of single data points in the final stage. This is not to claim that the method is error free, but rather that a segment-based approach (as in feature-based matching) is more robust than just relying on point-to-point comparison in a local neighborhood (as in signal-based matching). Effective filtering cannot be separated from some sort of object recognition, and identifying terrain patches or trees should not be seen as different from detecting buildings.
The final stage relies completely on the correctness of the classification of the cells labeled as terrain, since the overall approximation of the terrain is obtained only from cells classified as terrain. Some classification errors can be tolerated: small patches of low vegetation labeled as terrain are filtered out; buildings labeled as terrain, on the contrary, will not.
On the other hand, the further a cell is from the nearest terrain region (or the fewer the terrain cells), the smaller the probability that the approximating surface will truly follow the terrain and so actually help to correctly discriminate the point class.
Data interpolation, segmentation and classification have to find the best compromise between correct labeling of the terrain regions and the attempt to extend them as much as possible, in order to penetrate into the high vegetation areas and to reduce the number of small patches of terrain that, if completely surrounded by vegetation, would be much more difficult to classify reliably.

4. DATA INTERPOLATION, SEGMENTATION AND CLASSIFICATION

In the following paragraphs the three stages of the strategy are reviewed, highlighting the changes introduced with respect to earlier versions and the improvements obtained. Attention is also paid to the use of First and Last pulse and to how positive and negative outliers are dealt with.
The behaviour of the procedure is exemplified on the Site 5 dataset of the ISPRS laser scanning test, which offers a great variety of environments, with step edges in the terrain, slopes with different orientation, high vegetation on a steep hillside and a built-up area with vegetation, with a relatively low density of raw data.
Grid cells are assigned the elevation of the lowest raw point in the cell (see Figure 2). The larger the grid size, the more likely it is that noise (such as cars, low trees and so on) is prevented from affecting the aggregation process. On the other hand, increasing it too much will affect the extraction of detailed information (slope, aspect, …) from the grid, which may hamper the effectiveness of further steps. The best grid size should be between one and two times the raw data point spacing.

Figure 2. (a) Raw data points; (b) Grid data points
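The gridding rule described above (each cell takes the elevation of its lowest raw point) can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the function name grid_lowest_elevation, the choice of the minimum x and y as grid origin, and the use of None for empty cells are assumptions made for the example.

```python
def grid_lowest_elevation(points, cell_size):
    """Interpolate raw (x, y, z) points to a grid, keeping the lowest z per cell.

    Returns (grid, x0, y0). grid is a list of rows (row 0 holds the smallest y);
    cells that received no point hold None. The origin (x0, y0) is taken as the
    minimum x and y of the data, which is an assumption of this sketch.
    """
    if not points:
        return [], 0.0, 0.0

    x0 = min(p[0] for p in points)
    y0 = min(p[1] for p in points)

    def index(value, origin):
        # Same expression for sizing the grid and for binning the points,
        # so every point is guaranteed to fall inside the grid.
        return int((value - origin) // cell_size)

    n_cols = index(max(p[0] for p in points), x0) + 1
    n_rows = index(max(p[1] for p in points), y0) + 1
    grid = [[None] * n_cols for _ in range(n_rows)]

    for x, y, z in points:
        row, col = index(y, y0), index(x, x0)
        # Keep the lowest elevation seen so far in this cell.
        if grid[row][col] is None or z < grid[row][col]:
            grid[row][col] = z
    return grid, x0, y0
```

Following the guideline given in the text, cell_size would be chosen between one and two times the raw data point spacing.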
ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, Espoo, September 12-14, 2007, Finland
Empty grid cells are treated as no data, unless all 8-neighbours are non-empty: in this case, the cell value is set to the median of the 8-neighbours.
Figure 3 shows the TIN representation of the raw data (left) and of the grid data (right) in a smooth forest area. It is apparent that, due to the high penetration rate, in such cases the grid representation already constitutes a good, although noisy, approximation of the terrain.

Figure 3. TIN of the raw data and TIN of the grid data on a wooded area.

4.2 Grid Data Segmentation

For a given roof slope, the larger the cell size or the lower the point density, the more likely a fragmented segmentation was. The same may happen with very steep terrain although, as already pointed out, in such cases the aggregation may come from a smoother adjacent terrain area.
Indeed, the region growing threshold should be coupled to the grid cell size and should also take into account evidence of surface continuity in the neighborhood. Several changes have been made to the original implementation of the method to address this problem. The region growing algorithm is now steered by both the gradient orientation of the grid heights and the slope. The seed pixels of the region growing algorithm are chosen from regions larger than 30 m² with homogeneous gradient orientation, while the threshold value is adaptively adjusted to the slope of the region. Moreover, the process starts from the regions with the lowest threshold value.
In large regions with homogeneous gradient orientation the computation of the threshold will not be affected by vegetation.
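As a rough illustration of region growing on height differences with a slope-dependent threshold, consider the sketch below. The paper does not give the threshold formula, so adaptive_threshold (expected height step across one cell on the given slope, plus a fixed tolerance base_tol) is purely hypothetical, as are the 4-connectivity and the function names. Seed selection from regions larger than 30 m² with homogeneous gradient orientation is not reproduced here; the seed is simply passed in.

```python
import math
from collections import deque


def adaptive_threshold(slope_deg, cell_size, base_tol=0.3):
    """Hypothetical threshold rule (not taken from the paper): the height step
    expected across one cell on a surface of the given slope, plus a tolerance."""
    return cell_size * math.tan(math.radians(slope_deg)) + base_tol


def grow_region(grid, seed, threshold):
    """Grow a 4-connected region from seed (row, col).

    A neighbouring cell joins the region when its height differs from the
    already accepted cell it touches by no more than threshold.
    Cells holding None (no data) are never added.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                continue
            if (nr, nc) in region or grid[nr][nc] is None:
                continue
            if abs(grid[nr][nc] - grid[r][c]) <= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy grid: a gently sloping surface next to a block about 6 m higher.
heights = [
    [10.0, 10.1, 10.2, 16.0],
    [10.1, 10.2, 10.3, 16.1],
    [10.2, 10.3, None, 16.2],
]
terrain_like = grow_region(heights, (0, 0), adaptive_threshold(5.0, 1.0))
```

In this toy grid the region grown from the upper-left cell covers the gently sloping cells and stops at the height jump, which is the behaviour the height-difference criterion is meant to provide.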
4.3 Data Classification

Geometric characteristics of the regions and their topological relationships are computed and stored in a knowledge base. A rule-based scheme is applied to classify the regions: the outcome of the process labels each region as vegetation, building, terrain, outlier or unclassified (the last item typically being 1 to 3% of the area size). Actually, each class may have sub-classes (e.g. courtyard as part of terrain); among unclassified regions, narrow regions are defined as those slender in shape. Points on high-rise chimneys, towers, power line poles, etc. may be classified as outliers or buildings, depending on shape, point density and cell size. Currently, no rules discriminate bridges, which are therefore included in the terrain.

Figure 7. The most significant regions of the grid data segmentation by the adaptive threshold

The current set of rules has been drawn from simple models of characteristics and relationships between terrain, building and vegetation. The complexity of the task means that robustness of the rule set cannot be taken for granted and that more rules might have to be invoked in new scenarios. Most misclassification errors occur with trees labeled as buildings, buildings as terrain and terrain as buildings. The worst misclassification error is a building included in the terrain, because it will not be corrected in the next stage; on the contrary, terrain pixels erroneously labeled as building might be recovered in the last stage. Figure 8 shows the result of grid data classification.

4.4 Using First and Last Pulse

Almost every laser scanner today provides first and last (F&L) pulse returns; the pattern of their difference is of great help in identifying vegetation. This is very important to improve both data classification and raw data filtering: the percentage of grid points in a region where F&L pulse elevations differ is used to help the identification of terrain; raw data filtering (in terrain as well as non-terrain areas) can be made more robust by this information (see Section 5).

4.5 Outliers

Outliers in laser data are either “negative” (i.e. points below the surface, mostly due to multi-path) or “positive” (i.e. points above the surface, such as hits on birds, power cables, etc.). The segmentation makes the classification insensitive to single cells with positive or negative outliers in two ways: if the outlier is the only point in the cell, it will be put in a 1-pixel region and classified as outlier. If there are several points in the cell, some outliers and some not, the positive outliers will be recognized in the final filtering stage, because they are higher than the neighbourhood, whatever class the cell was assigned. With negative outliers, the pixel has already been labeled as an outlier by the grid classification; other points of the cell may be assigned to terrain or vegetation, depending on the distance from the approximating surface.
Even if several contiguous cells contain outliers, it is very unlikely that they end up grouped in a region, because this would