The 3D point clouds obtained from the different sources are first binned onto a 2D grid (similar to a Digital Elevation Model), as shown in Figure 6b. Each cell C of the grid (of constant size 0.5 × 0.5 m) is defined by its center and by the maximum and minimum elevation values of the 3D points it contains, as expressed in (1):

C = {(x_c, y_c), z_min, z_max}  (1)
The 3D points falling into each cell of this elevation map are stored and will be merged to form objects after the clustering phase.
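As an illustration of this binning step, a minimal Python sketch is given below. It assumes an (N, 3) NumPy array of XYZ points and the 0.5 m cell size mentioned above; the dictionary-based grid layout and field names are our own, not the authors' implementation.

```python
import numpy as np

CELL_SIZE = 0.5  # grid resolution in metres, as stated in the text

def bin_points(points):
    """Bin an (N, 3) array of XYZ points onto a 2D elevation grid.

    Each cell stores its centre and the min/max elevation of its points
    (cf. Eq. (1)), plus the points themselves for the clustering phase.
    """
    grid = {}
    idx = np.floor(points[:, :2] / CELL_SIZE).astype(int)  # cell indices
    for (i, j), pt in zip(map(tuple, idx), points):
        cell = grid.setdefault((i, j), {
            "center": ((i + 0.5) * CELL_SIZE, (j + 0.5) * CELL_SIZE),
            "z_min": np.inf, "z_max": -np.inf, "points": []})
        cell["points"].append(pt)
        cell["z_min"] = min(cell["z_min"], pt[2])
        cell["z_max"] = max(cell["z_max"], pt[2])
    return grid
```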
3.2.2. Clustering
Once the ground points are removed from the cells, we cluster the cells together to form larger super-cells. Any two cells are clustered together to form a new, larger cell, and their 3D points are merged, if the following condition is satisfied:

d(G_i, G_j) ≤ w_D  (2)

Here G_i and G_j are the geometrical centers of the 3D points contained in the i-th and j-th cell, respectively, and d is the Euclidean distance. w_D is given as w_D = √(s_X^2 + s_Y^2), where s_X and s_Y are the cell sizes along the X and Y directions. These cell sizes are initialized at the initial grid-cell size of 0.5 m; however, the new cell size then varies along the X and Y directions depending upon the axis along which the cells merge.
The 3D points of these two cells are merged together, and the values of G, s_X, s_Y, z_min and z_max are updated. An overview of the algorithm is presented in Algorithm 1. This modified grid (with cells of different sizes) now represents a collection of potential tree objects. These objects are then segmented/classified into tree trunks and vegetation in each cluster, as explained in the next section.
Algorithm 1 Clustering
input: Grid of 2D cells
repeat
    Select a 2D cell for clustering
    Find all neighboring cells satisfying (2) to include in the cluster
    Merge these cells to form a cluster
    Re-calculate the geometrical center G
    Update the values of s_X, s_Y, z_min and z_max
until all 2D cells in the grid are used
return the new updated grid
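To make the procedure concrete, a greedy single-pass sketch of Algorithm 1 in Python might look as follows; the dictionary layout, the field names and the use of condition (2) as reconstructed above are our assumptions, not the authors' implementation.

```python
import numpy as np

def cluster_cells(cells):
    """Greedily cluster grid cells into super-cells (cf. Algorithm 1).

    `cells` is a list of dicts, each with 'points' (an (N, 3) array of the
    cell's 3D points) and 's' (its (s_X, s_Y) cell sizes, initially 0.5 m).
    Two cells are merged when their geometrical centres satisfy the
    reconstructed condition (2).
    """
    clusters = []
    for cell in cells:
        pts = np.asarray(cell["points"])
        g = pts.mean(axis=0)                        # geometrical centre G
        for cl in clusters:
            w_d = np.hypot(*cl["s"])                # w_D from the cell sizes
            if np.linalg.norm(g[:2] - cl["g"][:2]) <= w_d:
                cl["points"] = np.vstack([cl["points"], pts])
                cl["g"] = cl["points"].mean(axis=0)            # re-calculate G
                cl["s"] = np.ptp(cl["points"][:, :2], axis=0)  # new s_X, s_Y
                break
        else:
            clusters.append({"points": pts, "g": g,
                             "s": np.asarray(cell["s"], dtype=float)})
    return clusters
```

A full implementation would repeat the pass until no two clusters satisfy (2), matching the repeat-until loop of Algorithm 1.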
3.2.3. Tree Segmentation
The 3D points contained in each of these super-cells are classified into two classes: {Tree trunk, Vegetation}. The Tree trunk class consists of the actual trunk visible in the 3D point cloud, while the Vegetation class consists of the leafy portion, including the canopy. In order to classify these 3D points, they are first over-segmented into 3D voxels and then converted into super-voxels. These are then clustered together using a Link-Chain method to form objects [42].
The method uses agglomerative clustering to group 3D points based on r-NN (radius Nearest Neighbor). Although the maximum voxel size is predefined, the actual voxel sizes vary according to the maximum and minimum values of the neighboring points found along each axis, so that the profile of the structure is maintained, as shown in Figure 8.
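A possible sketch of this r-NN voxelisation is shown below, using SciPy's cKDTree for the radius queries; the maximum voxel length value and the function layout are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.spatial import cKDTree

def voxelise(points, max_voxel_len=0.3):
    """Over-segment points into voxels via r-NN (radius nearest neighbour).

    Each unused seed point collects its unused neighbours within half the
    maximum voxel length; the voxel then shrinks to the min/max extent of
    those points along each axis, preserving the profile of the structure
    (cf. Figure 8).
    """
    points = np.asarray(points)
    tree = cKDTree(points)
    unused = np.ones(len(points), dtype=bool)
    voxels = []
    for seed in range(len(points)):
        if not unused[seed]:
            continue
        nbrs = tree.query_ball_point(points[seed], r=max_voxel_len / 2.0)
        nbrs = [i for i in nbrs if unused[i]]
        unused[nbrs] = False
        vox_pts = points[nbrs]
        voxels.append({
            "points": vox_pts,
            # actual voxel size varies with the extent of its points
            "size": vox_pts.max(axis=0) - vox_pts.min(axis=0),
        })
    return voxels
```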
A voxel is then transformed into a super-voxel when properties based on its constituent points are assigned to it. These properties mainly include the geometrical center, the mean hue (H) and saturation (S) values, the maximum of the variances of the H & S values, the mean laser intensity value, the variance of the laser intensity values, the voxel size along each of the X, Y & Z axes, and the surface normals of the constituent 3D points. Here, H and S are the values corresponding to the R, G & B (Red, Green and Blue) color values transformed into the HSV (Hue, Saturation and Value) space. As the R, G, B color values are prone to lighting variation (especially in dense forest environments), they are converted into the HSV color space for each 3D point. This conversion separates the color component from the intensity component. The intuitiveness of the HSV color space is also very useful, because each axis can be quantized independently. Wan and Kuo [43] reported that a color quantization scheme based on the HSV color space performs much better than one based on the RGB color space. The HS component, which is invariant to lighting conditions, is then analyzed; it is referred to in this paper as the color component, as it provides more stable color information. Based on the description presented by Hughes et al. [44], the following equations were used for the conversion:

V = max(R, G, B)
S = (V − min(R, G, B)) / V if V ≠ 0, and S = 0 otherwise
H = 60° × (G − B) / (V − min(R, G, B)) if V = R,
H = 60° × (2 + (B − R) / (V − min(R, G, B))) if V = G,
H = 60° × (4 + (R − G) / (V − min(R, G, B))) if V = B,
with H taken modulo 360°.

Here R, G, B and H, S, V are the coordinates of the corresponding point in the RGB and HSV spaces, respectively. Note that the normalized values of R, G, B are used, i.e., R, G, B ∈ [0, 1], and as a result H ∈ [0°, 360°] and S, V ∈ [0, 1]. In the case of S = 0, H is undefined and is hence assumed to be 0. After the conversion, the color component (H, S) is then used in our analysis.
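For reference, this conversion is available in Python's standard colorsys module; the short sketch below reproduces the convention adopted here (normalized R, G, B inputs, H scaled to degrees, H = 0 when S = 0).

```python
import colorsys

def rgb_to_hs(r, g, b):
    """Convert normalised R, G, B in [0, 1] to the (H, S) colour component.

    colorsys returns H in [0, 1], which we scale to degrees; when S == 0
    (greys), H is undefined and colorsys returns 0, matching the text.
    """
    h, s, _v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360.0, s

# Example: a mid-green point.
print(rgb_to_hs(0.2, 0.6, 0.3))  # -> (135.0, 0.666...)
```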
With the assignment of all these properties, a voxel is transformed into a super-voxel. All these properties are then used to cluster the super-voxels into objects using a link-chain method. In this method, each super-voxel is considered to be a link of a chain: certain super-voxels are selected as principal links, and all secondary links attached to each of these principal links are found. In the final step, all the principal links are linked together to form a continuous chain, removing redundant secondary links in the process (see Algorithm 2). If V_P is a principal link and V_Sj is the j-th secondary link, each V_Sj is linked to V_P if and only if the following three conditions are fulfilled:

d(X_P, X_Sj) ≤ w_D  (3)
||C_P − C_Sj|| ≤ w_C  (4)
|I_P − I_Sj| ≤ w_I  (5)
where, for the principal and secondary link super-voxels, respectively:
X_P, X_Sj are the geometrical centers;
C_P, C_Sj are the mean H & S values;
I_P, I_Sj are the mean laser intensity values;
w_C is the color weight, equal to the maximum of the two variances of the H & S values;
w_I is the laser intensity weight, equal to the maximum of the two variances of the laser intensity values;
w_D is the distance weight, given as w_D = ½√(s_X^2 + s_Y^2 + s_Z^2) + c_d, where s_X, s_Y and s_Z are the voxel sizes along the X, Y & Z axes, respectively, and c_d is the inter-distance constant (along the three dimensions), added depending upon the density of points and also to overcome measurement errors, holes, occlusions, etc.
If the color or the laser intensity values are not available in a particular data set, condition (4) or (5), respectively, can be dropped. The more information there is, the better the results, although the method continues to work without it.
Algorithm 2 Segmentation
input: 3D points
repeat
    Select a 3D point for voxelisation
    Find all the neighboring points to be included in the voxel using r-NN within the specified maximum voxel length
    Transform the voxel into a super-voxel by first finding and then assigning to it all the properties found using PCA, including the surface normal
until all 3D points are used in a voxel
repeat
    Select a super-voxel as a principal link
    Find all secondary links satisfying conditions (3)-(5)
until all the super-voxels are used
Link all the principal links to form a chain, removing the redundant links in the process
return Segmented objects
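Under conditions (3)-(5) as reconstructed above, the linking test at the heart of Algorithm 2 could be sketched as follows; the super-voxel field names, the default value of c_d and the use of the principal link's voxel sizes in w_D are our assumptions.

```python
import numpy as np

def can_link(sv_p, sv_s, c_d=0.25):
    """Test conditions (3)-(5) between a principal and a secondary
    super-voxel.

    Each super-voxel is a dict with geometrical centre 'x', mean (H, S)
    values 'c', their maximum variance 'var_c', mean laser intensity 'i',
    intensity variance 'var_i' and per-axis sizes 's' (all assumed names).
    """
    s = np.asarray(sv_p["s"])
    w_d = 0.5 * np.linalg.norm(s) + c_d       # distance weight (reconstructed)
    w_c = max(sv_p["var_c"], sv_s["var_c"])   # colour weight
    w_i = max(sv_p["var_i"], sv_s["var_i"])   # intensity weight
    return (np.linalg.norm(np.asarray(sv_p["x"]) - np.asarray(sv_s["x"])) <= w_d
            and np.linalg.norm(np.asarray(sv_p["c"]) - np.asarray(sv_s["c"])) <= w_c
            and abs(sv_p["i"] - sv_s["i"]) <= w_i)
```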
These clustered objects are then classified using local descriptors and geometrical features into two main classes: {Tree trunk, Vegetation}. These descriptors and features mainly include the following:
Surface normals: The orientation of the surface normals is found to be essential for distinguishing between Tree trunk and Vegetation: for the former, the surface normals are predominantly (above a set threshold) parallel to the X-Y plane (the ground plane, as seen in Figure 6d), whereas for the vegetation the surface normals are scattered in all directions.
Color and intensity: Intensity and color are also important discriminating factors for the two object classes.
Geometrical center and barycenter: The height difference between the geometrical center and the barycenter, along with other properties, is very useful in distinguishing objects such as tree trunks and vegetation. For tree trunks, the two are closely aligned (a trunk being a symmetric, pole-like structure), whereas for vegetation they can differ depending on its shape.
Geometrical shape: Along with the above-mentioned descriptors, geometrical shape plays an important role in classifying objects. In 3D space, tree trunks always appear long and thin, while vegetation is usually more spread out, broad and large, with a height that depends upon the type of vegetation (i.e., tree canopy, ground bushes, etc.).
As the object classes are so distinctly different, a simple threshold-based method is used, as presented by Aijazi et al. [42], where the values of the comparison thresholds for these features/descriptors are set accordingly. However, the features can also be used to train an SVM classifier, as described by Aijazi [45].
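To illustrate the threshold-based step, the sketch below classifies a segmented object from three of the descriptors listed above; all threshold values are illustrative placeholders, not those used in [42].

```python
import numpy as np

def classify_object(points, normals,
                    horiz_frac_thresh=0.7, elong_thresh=3.0):
    """Classify a segmented object as 'Tree trunk' or 'Vegetation'.

    `points` is an (N, 3) array of the object's 3D points and `normals`
    the (N, 3) unit surface normals. Uses the fraction of normals roughly
    parallel to the ground plane, the vertical offset between geometrical
    centre and barycentre, and the object's elongation.
    """
    # Normals parallel to the X-Y plane have a small Z component.
    horiz_frac = np.mean(np.abs(normals[:, 2]) < 0.3)

    centre = 0.5 * (points.max(axis=0) + points.min(axis=0))  # geometrical centre
    barycentre = points.mean(axis=0)
    centred = abs(centre[2] - barycentre[2]) < 0.2            # closely aligned?

    extent = points.max(axis=0) - points.min(axis=0)
    elongated = extent[2] > elong_thresh * max(extent[0], extent[1])  # long & thin

    if horiz_frac > horiz_frac_thresh and centred and elongated:
        return "Tree trunk"
    return "Vegetation"
```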
Some results of this method are shown in Figure 6. Its salient features are data reduction, efficiency and simplicity of approach. During this process, the ground under the tree trunk is also segmented out; we use it to determine the local slope of the ground on which the tree stands, as explained in the next section.