Assessment of GEDI's LiDAR Data for the Estimation of Canopy Heights and Wood Volume of Eucalyptus Plantations in Brazil
Abstract—Over the past two decades, spaceborne LiDAR systems have gained momentum in the remote sensing community with their ability to accurately estimate canopy heights and aboveground biomass. This article aims at using the most recent global ecosystem dynamics investigation (GEDI) LiDAR system data to estimate the stand-scale dominant heights (Hdom) and stand volume (V) of Eucalyptus plantations in Brazil. These plantations provide a valuable case study due to the homogeneous canopy cover and the availability of precise field measurements. Several linear and nonlinear regression models were used for the estimation of Hdom and V based on several GEDI metrics. Hdom and V estimation results showed that over low-sloped terrain the most accurate estimates of Hdom and V were obtained using the stepwise regression, with a root-mean-square error (RMSE) of 1.33 m (R² of 0.93) and 24.39 m³·ha⁻¹ (R² of 0.90), respectively. The principal metric, explaining more than 87% and 84% of the variability (R²) of Hdom and V, respectively, was the metric representing the height above the ground at which 90% of the waveform energy occurs. Testing the postprocessed GEDI metric values issued from the six available processing algorithms showed that the accuracy of the Hdom and V estimates is algorithm dependent, with a 16% observed increase in RMSE on both variables using algorithm a5 versus a1. Finally, the choice to select the ground return from the last detected mode or from the stronger of the last two modes could also affect the Hdom estimation accuracy, with a 12 cm RMSE decrease using the latter.

Index Terms—Brazil, dominant heights, eucalyptus, global ecosystem dynamics investigation (GEDI), LiDAR, wood volume.

I. INTRODUCTION

IN THE last couple of decades, global concerns on the increased atmospheric concentration of greenhouse gases, such as CO2, have raised interest in quantifying the state and change of forest resources due to the key role of forests in the global carbon cycle [1], [2]. Forests sequester a large quantity of carbon in their woody biomass, where they store around 70% to 90% of the global terrestrial biomass, ranging from 385 × 10⁹ to 650 × 10⁹ Mg [3]. Hence, the accurate estimation of forest biomass is needed to better determine its precise role in the global carbon cycle [4], [5]. Forest plantations represent a small fraction (6.9%) of the total forested land [6] but are becoming increasingly important around the world, economically, socially, and environmentally [7], [8].

The primary source of aboveground biomass (AGB) estimation in tropical forests at large scales came in the last years from observations and measurements from different satellite
remote sensing platforms. Methods based on remotely sensed data are less accurate than field measurements, however, their major advantages are their global and frequent coverage and the low or free acquisition costs for the end user. Currently optical, radar, and LiDAR are the three main sources of remotely sensed data used in AGB estimation techniques. Nonetheless, current data sources are either limited to low AGB levels (<150 Mg/ha) (sensor saturation at certain biomass levels with radar and optical data) or have a limited spatial coverage (e.g., airborne LiDAR data). LiDAR systems either airborne or spaceborne have the capability to capture the horizontal and vertical structure of vegetation comprehensively [9], and can thus estimate biomass with better precision in comparison to the techniques using radar or optical data [10], [11]. To date, there have been only three satellite LiDAR missions. The first mission was the Ice, cloud, and land elevation satellite (ICESat-1) which carried the geoscience laser altimeter system (GLAS) from 2003 until 2009 [12]. Although GLAS’s ∼60 m diameter footprint was larger than the ideal resolution for forest observations [13], its capability to estimate forest parameters (e.g., canopy heights and biomass) has been exploited in numerous studies during its operational and post-operational periods [5], [14]–[20].

Manuscript received November 13, 2020; revised January 11, 2021 and May 5, 2021; accepted June 15, 2021. Date of publication June 28, 2021; date of current version July 26, 2021. This work was supported in part by the French Space Study Center (CNES, TOSCA 2020 project), and in part by the National Research Institute for Agriculture, Food, and the Environment (INRAE). (Corresponding author: Ibrahim Fayad.)
Ibrahim Fayad and Nicolas N. Baghdadi are with the French National Research Institute for Agriculture, Food and the Environment (INRAE), CIRAD, CNRS, TETIS, AgroParisTech, Université de Montpellier, 34093 Montpellier, France (e-mail: [email protected]; [email protected]).
Clayton Alcarde Alvares is with the UNESP, Faculdade de Ciências Agronômicas Botucatu 18610-034, Brazil, and also with the Suzano SA, Limeira 13465-970, Brazil (e-mail: [email protected]).
Jose Luiz Stape is with the UNESP, Faculdade de Ciências Agronômicas, Botucatu 18610-034, Brazil (e-mail: [email protected]).
Jean Stéphane Bailly is with the INRAE, IRD, Institut Agro, LISAH, Université de Montpellier, 34060 Montpellier, France, and also with the AgroParisTech, 75005 Paris, France (e-mail: [email protected]).
Henrique Ferraço Scolforo is with the Suzano SA, Limeira 13465-970, Brazil (e-mail: [email protected]).
Mehrez Zribi is with the Center for the Study of the Biosphere from Space (CNRS/UPS/IRD/CNES/INRAE), 31401 Toulouse, France (e-mail: [email protected]).
Guerric Le Maire is with the CIRAD, UMR Eco&Sols, 34398 Montpellier, France, and also with the Eco&Sols, CIRAD, INRA, IRD, Montpellier SupAgro, Université de Montpellier, 34060 Montpellier, France (e-mail: [email protected]).
Digital Object Identifier 10.1109/JSTARS.2021.3092836
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
7096 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021
ICESat-1 was followed in 2018 by ICESat-2, which carried the advanced topographic laser altimeter system (ATLAS) with a goal to measure ice-sheet topography, cloud and atmospheric properties, and global vegetation. However, the wavelength of the equipped laser (532 nm) lies in a spectral region of high radiation absorption by the vegetation. This results in a low number of reflected photons measured by ATLAS over vegetation and limits its ability to estimate forest canopy heights [21]. The most recent spaceborne LiDAR system is GEDI on board the ISS, which was launched in December 2018 with on-orbit checkout in April 2019. GEDI’s mission objective is to provide information about canopy structure, biomass, and topography, and it is estimated to acquire 10 billion cloud-free shots in its two-year mission [22]. GEDI measures vertical structures similarly to ICESat-1 (i.e., as waveforms). However, given GEDI’s higher sampling rate (242 versus 40 Hz for ICESat-1) and its much smaller footprint size (∼25 versus ∼60 m for ICESat-1), GEDI provides a highly improved coverage and waveform precision.

GEDI’s ability to estimate forest height and wood volume on different types of forest ecosystems, topography, and latitudes is of paramount importance. GEDI datasets are organized in different levels of products, from raw acquisition data to more elaborated data obtained by performing signal analysis and metrics extraction from the waveforms. This results in a large number of metrics for each acquired footprint, from which many different models could be used to retrieve canopy heights and wood volume. While direct metrics could be used as good proxies, it is acknowledged that combining different metrics yields higher accuracies. For instance, such algorithms make use of linear or nonlinear regression models applied on sets of metrics extracted from GEDI waveforms, and eventually combined with digital elevation models (DEMs). The full waveform LiDAR data can potentially give access to more information on canopy structure than the basic “top” and “bottom” return signals, being itself potentially informative for canopy height and volume prediction. Therefore, it is critical to explore which metrics, or combination of metrics, and with which type of models (e.g., linear versus nonlinear) provide the best forest parameter estimates. It is also important to evaluate the effect of the uncertainty of the metrics estimation themselves, which results from differences in preprocessing algorithms, as well as other acquisition characteristics that may influence the final models, such as beam acquisition angles.

Precise evaluation of forest height and volume is not an easy task. One of the main issues is that uncertainties in field measurements can propagate through the models and create larger uncertainties in the estimates [23]. For example, Saarela et al. [24] and Holm et al. [25] found that not accounting for errors in field measurements could underestimate the uncertainty in final satellite-based AGB maps by a factor of three or more. Feldpausch et al. [26] and Kearsley et al. [27] found that uncertainties in tree height measurements led to increased bias in the biomass and carbon stock estimates. Other obstacles include: the influence of tree growth during the timespan between the field measurements and satellite acquisitions, which cannot be neglected [28]; the comprehensive model validation limited by the sparsity of in situ data [29]; and the method used to measure tree heights [30].

Consequently, in such evaluation and comparison of processing algorithms and models for forest height and volume estimation, the field dataset plays a critical role. In this article, we want to focus on the uncertainty coming from the GEDI metrics and models, minimizing the influence of the uncertainty on in situ measurements. To reach this objective, we analyzed a large dataset of forest plantations in Brazil, which has many advantages to serve as a test case: large number of sites, different climate and topographical environments, numerous and frequent measurements, precise measurements of tree heights, good allometric relationships for wood volume, homogeneous canopies, etc. (see description in Section II-C). Even though they are not representative of all forests, the results obtained on forest plantations can give a notion of the reachable precision on height and wood volume estimation using GEDI data on a structurally simpler forest than natural forests, while also removing part of the errors due to in situ measurements.

The main objectives of this article are therefore summarized in the following questions.
1) What are the most important GEDI metrics linked to canopy height and volume?
2) Are linear and nonlinear models using subsets of metrics more efficient in predicting height and volume?
3) What is the importance of the different preprocessing algorithms on the final uncertainties?
4) Is there an influence of other acquisition characteristics, such as viewing angle, on the estimated forest characteristics?
5) Is other stand information, such as data from a DEM or the age of the stand, relevant for the estimation of height and volume on forest plantations?

The manuscript presents first the GEDI dataset, followed by the processing of GEDI data and the main metrics that will be used for the estimation of canopy heights and wood volume. Next, a description of the methods used for the estimation of the forest characteristics is presented in Section III. Finally, the results, discussions, and main conclusions are presented in Sections IV, V, and VI, respectively.

II. STUDY SITE AND DATASETS

A. Study Area

The study area is located in four regions in Brazil (Bahia & Espírito Santo, Mato Grosso do Sul, São Paulo, and Maranhão) across a large latitudinal gradient (see Fig. 1) and covering different climate and soil types. The studied plantations are managed in order to produce high-yield pulpwood growing at short rotations. Clonal seedlings of mainly E. grandis (W. Hill) and E. urophylla (S.T. Blake) and different types of hybrids are planted in rows at a density of 1000–1667 trees/ha, rationally fertilized with nitrogen, phosphorus, potassium, and micronutrients to alleviate any nutritional limitations. Harvest occurs every six to seven years, and very little tree mortality (under 7% from the original plantation) is noticed. The annual productivity of the plantations was on average 40 m³/ha/year, with 80% of the stands being between 30 and 50 m³/ha/year, and some stands could reach values as high as 60 m³/ha/year. At harvest time, the stand volume is between 180 and 300 m³/ha, with a dominant height
FAYAD et al.: ASSESSMENT OF GEDI’S LIDAR DATA 7097
Fig. 1. (a) Location of the four study sites. (b) Example of GEDI tracks over some stands. (c) Eucalyptus stand during harvest (approx. 30 m high) illustrating
the clearly separated crown and trunk strata.
of 20 to 35 m range (for 80% of the stands). These plantations were managed locally by stand units, generally around 50 ha, where the same management is applied: planting, harvesting, weed control, genetic material, soil preparation, and fertilization. There are generally sparse understory and herbaceous strata in these plantations, as a result of chemical weeding the first year, the closing of the canopy, and the high competitive strength of Eucalyptus. Tree height is very homogeneous within a stand, with 95% of the trees having heights within ±10.5% of the average tree height in plot inventories. The plantations exhibit a simple structure, with a tree crown stratum of 3 to 10 m in width above a “trunk stratum” with few Eucalyptus leaves and few understories [see Fig. 1(c)]. The “soil stratum” is mainly constituted of litter accumulation of branches and leaves, with some patches of herbaceous species.

B. GEDI Data

1) Processing of GEDI Waveforms: GEDI uses three onboard lasers that produce eight parallel tracks of observations. GEDI lasers illuminate a surface or footprint on the ground with a 25 m diameter, at a frequency of 242 Hz, over which 3D structures are measured. The footprints are separated by ∼60 m (center to center) along the track, and the tracks are separated by ∼600 m across. Moreover, GEDI has the ability to rotate up to six degrees, allowing the lasers to be pointed as much as 40 km on either side of the ISS’s ground track [22]. GEDI measures vertical structures using a 1064-nm laser pulse, and the echoed waveforms are digitized to a maximum of 1246 bins with a vertical resolution of 1 ns (15 cm), corresponding to a maximum of 186.9 m of height ranges, with a vertical accuracy over relatively flat, non-vegetated surfaces of ∼3 cm [31].

As described in the algorithm theoretical basis document (ATBD) [32], [33], the received waveforms are first smoothed to reduce the noise in the signal, thus permitting the determination of the useful part of the waveform within the corresponding footprint. Waveform smoothing is performed by means of a Gaussian filter with various widths. As mentioned in the ATBD, a width of 6.5 ns is currently used for the Gaussian filter (Smooth width). After smoothing, two locations in the waveform denoted as search start and search end are determined [see Fig. 2(a)]. Search start and search end are, respectively, the first and last positions in the signal where the signal intensity is above the following threshold:

$$\text{threshold} = \text{mean} + \sigma \cdot v \tag{1}$$

where “mean” is the mean noise level, “σ” is the standard deviation of noise of the smoothed waveform, and “v” is a constant currently set at 4. After determining the locations of search start and search end, the region between them, denoted as the waveform extent, is extended by a predetermined number of sample bins, currently set to 100 bins at both sides. Within the waveform extent, the highest (toploc) and lowest (botloc) detectable returns are determined [see Fig. 2(a)]. The metrics toploc and botloc respectively represent the highest and lowest locations within the waveform extent where two adjacent
Fig. 2. (a) Example of an acquired GEDI waveform (Rw) over a Eucalyptus stand (Hdom = 25.9 m; V = 230.7 m³·ha⁻¹), its smoothing (Sw), and corresponding waveform metrics. (b) Cumulative energy of the waveform (CE) between botloc and toploc and the corresponding relative heights (RHn) at different percentages “n” for the same waveform. One (1) ns corresponds to 15 cm sampling distance in the waveform. The waveform amplitudes are counts from the analog-to-digital converter on the instrument.
intensities are above a threshold. The threshold equation used to determine toploc and botloc is the same as (1), with “v” an integer fixed at 2, 3, 4, or 6 (depending on the processing configuration). In the ATBD, the value of “v” used to determine toploc is named “Front_threshold” and “back_threshold” for botloc. Waveform metric values are extracted using thresholds on Smoothwidth_zcross, front_threshold, and back_threshold. Currently, there are six configurations (henceforth referred to as algorithms) of different thresholds on these variables, which are used to determine waveform metrics with high precision for a variety of acquisition scenarios (see Table I). Finally, the location of the distinctive peaks or modes in the waveform, such as the ground peak, or top of canopy peaks is determined using a second Gaussian filtering of the waveform section between toploc and botloc, and then finding all the zero crossings of the first derivative of the filtered waveform [see Fig. 2(a)]. The width of the second Gaussian filter (“Smoothwidth_zcross”) is fixed to either 3.5 or 6.5 ns (based on the algorithm used). Finally, the position of the ground return within the waveform is determined using the position of the last detected mode. The six different algorithms, noted a1 to a6 correspond to different values of the above-mentioned parameters (see Table I) and lead to different estimates of the waveform metrics, and could in turn lead to six different canopy height estimates. Over forest stands, the recorded waveforms are multimodal in shape, with each mode representing a reflection from a distinct surface height. Fig. 2(a) shows a typical waveform over a Eucalyptus forest stand on relatively flat terrain. Over flat terrain, the first Gaussian corresponds to a reflection from the top of the canopy while the last Gaussian mostly refers to the lowest point in the footprint, i.e., the ground surface.

GEDI data used in this article have been already processed and published by the land processes distributed active archive center (LP DAAC). Currently, three products (L1B, L2A, and L2B) are available for download. The L1B data product [32] contains detailed information about the transmitted and received
TABLE I
DIFFERENT THRESHOLDS USED IN EACH OF THE SIX ALGORITHMS FOR THE ANALYSIS OF THE RECEIVED WAVEFORMS
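
The thresholding logic that these algorithm settings parameterize can be illustrated with a minimal sketch. It is not the ATBD implementation: the function names (`noise_threshold`, `search_bounds`, `extend_extent`, `toploc_botloc`) are hypothetical, the waveform is assumed to be an already smoothed list of amplitudes with bin index increasing from the top of the record downward, and the noise statistics are assumed to be supplied by the caller.

```python
def noise_threshold(mean_noise, sigma_noise, v=4):
    """Eq. (1): threshold = mean + sigma * v."""
    return mean_noise + sigma_noise * v


def search_bounds(waveform, threshold):
    """Return (search_start, search_end): the first and last bin indices
    whose amplitude is above the threshold, or None if no bin qualifies."""
    above = [i for i, amp in enumerate(waveform) if amp > threshold]
    if not above:
        return None
    return above[0], above[-1]


def extend_extent(start, end, n_bins, pad=100):
    """Extend the waveform extent by `pad` bins on both sides,
    clipped to the valid bin range [0, n_bins - 1]."""
    return max(0, start - pad), min(n_bins - 1, end + pad)


def toploc_botloc(waveform, threshold):
    """Return (toploc, botloc): the highest and lowest bins belonging to a
    pair of adjacent bins that are both above the threshold.
    Pass only the extended waveform extent to restrict the search."""
    pairs = [i for i in range(len(waveform) - 1)
             if waveform[i] > threshold and waveform[i + 1] > threshold]
    if not pairs:
        return None
    return pairs[0], pairs[-1] + 1
```

Requiring two adjacent bins above the threshold makes toploc and botloc insensitive to an isolated noise spike, whereas the search start and search end positions react to any single bin above the threshold.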
TABLE II
LIST OF ALL THE VARIABLES CALCULATED FROM GEDI WAVEFORMS
Variables to be used as predictor variables in the canopy height and wood volume estimation models are highlighted in gray.
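
As an illustration of how a relative height metric such as RH90 relates to the cumulative waveform energy shown in Fig. 2(b), the following sketch accumulates energy from botloc upward and reports the height above the ground bin at which a given percentage is reached. This is a simplified reading of the metric, not the L2A production code: the function name `relative_height`, the clipping of negative amplitudes to zero, and the convention that bin index increases downward are all assumptions.

```python
BIN_SIZE_M = 0.15  # one 1 ns sample corresponds to 15 cm, as stated in the text


def relative_height(waveform, toploc, botloc, ground_bin, percent):
    """Height above `ground_bin` (in meters) at which `percent` of the
    energy between botloc and toploc has been accumulated, counting
    upward from botloc. Bin index is assumed to increase downward."""
    segment = [max(0.0, amp) for amp in waveform[toploc:botloc + 1]]
    total = sum(segment)
    if total <= 0:
        raise ValueError("no energy between toploc and botloc")
    target = total * percent / 100.0
    cumulative = 0.0
    # Walk from the lowest return (botloc) toward the top of the canopy.
    for offset, amp in enumerate(reversed(segment)):
        cumulative += amp
        if cumulative >= target:
            return (ground_bin - (botloc - offset)) * BIN_SIZE_M
    return (ground_bin - toploc) * BIN_SIZE_M
```

With this convention, the 100% relative height coincides with the distance between toploc and the ground bin, and lower percentages fall progressively closer to the ground return.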
TABLE III
MEAN AND STANDARD DEVIATION OF SOME GEDI METRIC VALUES FROM EACH OF THE SIX PROCESSING ALGORITHMS
USING ALL GEDI SHOTS OVER THE 566 SELECTED EUCALYPTUS PLANTATIONS
C. Inventory Measurements

A total of 566 Eucalyptus stands were selected, corresponding to stands where GEDI footprints acquired between April 20, 2019 and September 4, 2019 were totally included. An additional 50 m internal buffer strip from the stand borders was used to account for any footprint geolocation errors and to avoid footprints that match the boundary between the stand of interest and the surrounding medium. These 566 Eucalyptus stands were also selected because they had field inventories performed by the company close to GEDI’s acquisition date (time difference fewer than two months). Field inventories are performed on several permanent inventory plots within each stand. These inventory plots are systematically distributed throughout the stand with a density of one plot per 10 ha (i.e., a 20 ha stand will have two inventory plots while an 80 ha plot will contain eight inventory plots). These permanent inventory plots had each an area of approximately 400 m² including 30 to 100 trees (average of 58

in situ Hdom and in situ V represent the 95th percentile of in situ values for each site.

2) Waveforms with a difference between waveform extent (Wext) and (Gloc–Vloc) higher than 400 bins (corresponding to 60 m)

A total of 6166 footprints were acquired over our reference stands between April 2019 and September 2019, with the majority of these footprints (92.15%) providing exploitable waveforms. Table IV gives the distribution of GEDI shots across the four regions.

GEDI data accessible through NASA’s LP DAAC contain a quality flag (quality_flag) for each acquired waveform. A waveform with a quality flag set to “1” indicates that the waveform
statistical approaches have been developed and used in several studies to predict canopy heights from GLAS data (e.g., [5], [15], [35], [38], [39]). These approaches proposed regression models based either on only waveform metrics or on both waveform metrics and terrain information derived from DEMs.

The first statistical model was developed by Lefsky et al. [5] to estimate the maximum canopy height (Hdom) from GLAS waveforms

$$H_{dom} = a\,W_{ext} - b\,TI \tag{3}$$

The coefficients a and b are fitted using least squares regression (Hdom given by inventory measurements, Wext is derived from the GEDI waveform, and TI is calculated from the SRTM DEM, see Section II-D). For our dataset, TI values calculated from the SRTM DEM ranged from 1 to 46 m. The incorporation by Payn et al. [6] of the waveform leading edge extent in (4) showed a slight improvement on canopy height estimation

$$H_{dom} = a\,W_{ext} - b\,TI + c\,Lead_{ext} \tag{4}$$

Over sloping terrain, Lefsky et al. [38] observed that the waveform extent is insufficient for estimating canopy heights. Hence, a new model based on the waveform extent, leading edge extent, and trailing edge extent was proposed. However, Pang et al. [39] observed inaccurate estimates of canopy heights with the improved model by Lefsky et al. [38], especially for small waveform extents, and thus proposed a simpler model to estimate canopy heights using the following equation:

$$H_{dom} = a\,W_{ext} - \left\{ b \left( Lead_{ext} + Trail_{ext} \right) \right\}^{c} \tag{5}$$

The nonlinear model by Pang et al. [39] was further simplified by Chen [15]

$$H_{dom} = a\,W_{ext} - b \left( Lead_{ext} + Trail_{ext} \right) \tag{6}$$

Baghdadi et al. [16] tested additional models for the estimation of canopy heights using ICESat-1 GLAS waveforms, of which, two will be tested in this article. The first model uses the Trailext and TI

$$H_{dom} = a\,W_{ext} - b\,TI + c\,Trail_{ext} \tag{7}$$

The second model uses exclusively GEDI metrics

$$H_{dom} = a\,W_{ext} - b\,Lead_{ext} - c\,Trail_{ext} + d \tag{8}$$

effect) between the explanatory variables. For this article, the number of trees in the RF were set to 100 trees (higher tree count slightly increased model accuracy), with a tree depth equal to the square root of the number of available factors.

Finally, since random forests are nonlinear and nonparametric, we only used the original relative heights without modification (i.e., $RH_n^1$, 10% ≤ n ≤ 100%, step 10%).

B. Wood Volume Estimation

The estimation of aboveground biomass has been proven to be successful using ICESat-1 GLAS waveforms as demonstrated by several studies ([15]–[17]). In this article, four models were tested to estimate wood volume from GEDI waveforms based on Hdom estimates. The first model was adapted from Lefsky et al. [5] for the estimation of wood volume (instead of AGB in its original formulation), using the squared dominant canopy heights (Hdom)

$$V = a + b\,H_{dom}^{2} \tag{9}$$

The second tested model was adapted from Saatchi et al. [41], and uses a power law relationship between the volume and Lorey’s height

$$V = a\,H_{L}^{\,b} \tag{10}$$

where HL is Lorey’s height which weighs the contribution of trees (all trees >10 cm in diameter) to the stand height by their basal area. In this article, the relationship defined in (10) was used by replacing Lorey’s height with the dominant height as both height values were similar (HL was lower than Hdom by a maximum of 0.9 m at the end of the rotation of the Eucalyptus plantation) [16]. For both models (9 and 10), the coefficients a and b were first fitted using in situ measurements of dominant height and wood volume (see Fig. 6), and then, the calibrated equations were used to estimate wood volume using the dominant height predicted from GEDI footprints (best model from Section III-A).

Similarly to Section III-A, a stepwise linear regression model (SRV) and a random forest regressor (RFV) were used to estimate the wood volume.
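
The two-step procedure described for models (9) and (10), calibrating on in situ data and then applying the calibrated equation to GEDI-predicted heights, can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, and fitting (10) by ordinary least squares on log-transformed values is an assumption, since the text does not state how the power law was fitted.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.
    Returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_quadratic_volume(hdom, volume):
    """Model (9): V = a + b * Hdom**2. Returns (a, b)."""
    return fit_line([h * h for h in hdom], volume)


def fit_power_volume(hdom, volume):
    """Model (10) with Hdom in place of Lorey's height: V = a * Hdom**b.
    Fitted on log-log values (an assumption). Returns (a, b)."""
    ln_a, b = fit_line([math.log(h) for h in hdom],
                       [math.log(v) for v in volume])
    return math.exp(ln_a), b


def predict_quadratic(a, b, hdom):
    """Apply calibrated model (9) to a (possibly GEDI-predicted) height."""
    return a + b * hdom * hdom


def predict_power(a, b, hdom):
    """Apply calibrated model (10) to a (possibly GEDI-predicted) height."""
    return a * hdom ** b
```

In use, the coefficients would be calibrated once on the in situ pairs of dominant height and volume, and `predict_quadratic` or `predict_power` would then be called with the dominant height estimated from each GEDI footprint.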
Fig. 6. Comparison of measured vs. estimated Hdom from the models presented in Section III-A using GEDI metrics extracted with algorithm a1 (see Table I).
RMSE is expressed in meters (m).
are defined as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{11}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{12}$$

$$RMSPE = 100 \cdot \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_i - \hat{y}_i}{y_i} \right)^2} \tag{13}$$

where $y_i$ is the observed value, $\hat{y}_i$ is the estimated value, $\bar{y}$ is the mean of all the observed values, and n is the sample size.

The AIC proposed by Akaike [42] is a measurement of the relative goodness of fit of a statistical model to the true values. By calculating AIC values for each model, the best-performing model, i.e., the one with the lowest AIC value, can be identified.

IV. RESULTS

A. Canopy Height Estimation

We start our model performance analysis using GEDI metrics extracted from algorithm a1 (see Table I), and the ground location as determined from the SM field from the L2A dataset (last detected nonnoise mode). The estimation of the canopy dominant heights (Hdom) using the linear regression models [(3) through (8)] with five-fold cross validation shows an accuracy (RMSE) between 1.70 and 2.31 m with a coefficient of determination (R²) between 0.80 and 0.89 (see Fig. 6). Moreover, the contribution of the trailing edge extent appeared to be higher than that of the leading edge extent [see (7) versus (4), Table V]. However, the best model among (3) through (8) was (8) (RMSE = 1.70 m and R² = 0.89), which uses both leading and trailing edge extents, with an independent coefficient fitted for each variable. The introduction of terrain information in the linear regression models did not show any significant improvement in the accuracy of the estimations.

The stepwise linear regression model (see Fig. 6, SRH) showed slightly better accuracy for the estimation of canopy heights (RMSE = 1.44 m, R² = 0.93) in comparison to (8). However, unlike (8), which relied on Wext, Leadext, and Trailext, the most contributing variables for the estimation of the canopy heights using the SRH model were RH90, followed by RH10, RH80, and RH100. Meanwhile, the other metrics (e.g., Leadext, Trailext, TI, etc.) were not necessary.

Furthermore, the estimation of canopy heights using only RH90 (by linear fitting) showed an RMSE of 1.63 m with an R² of 0.90, and this accuracy could be improved to an RMSE of 1.5 m (R² of 0.91) by only adding $RH_{10}^{1.8}$. The estimation of
TABLE V
MODELS’ PERFORMANCE AND THE FITTED LINEAR EQUATIONS FOR ESTIMATING EUCALYPTUS STAND DOMINANT HEIGHTS
The variables are described in Section II-B-2, with the models described in Section III-A
Fig. 7. Classification of the variables by decreasing order of importance in the RFH model for stand dominant height estimation. The importance is measured via the average percentage increase of MSE (%IncMSE) over 50 repetitions. The red bars indicate the standard deviation of %IncMSE.
TABLE VI
ACCURACY (RMSE IN M) OF THE MODELS PRESENTED IN SECTION III-A FOR THE ESTIMATION OF Hdom USING GEDI METRIC VALUES
EXTRACTED USING THE SIX DIFFERENT ALGORITHMS (A1 THROUGH A6)
Hdom using the random forest regressor (RFH, Fig. 6) with the GEDI metrics in Table II (p in $RH_n^p$ was set to 1 for RFH) as the explanatory variables showed an accuracy on the canopy height estimates similar to that of the SRH model.

The variable importance test of the metrics (see Fig. 7) used in RFH showed that the most contributing factors for the estimation of GEDI canopy heights are a combination of RH90, RH100, and, to a lesser extent, RH80. These results show that in a low relief area the use of other metrics in addition to RH90 only slightly improved the precision of the estimation of canopy heights.

The estimation of Hdom using the models described previously with GEDI metrics extracted from the five remaining algorithms (a2 to a6, see Table I) has also been tested. The results presented in Table VI show that for the linear regression models [(3) through (8)], canopy height estimation was worse with the metrics from algorithms a2 through a6 than with the metrics from algorithm a1, with an RMSE on the canopy height estimates ranging from 2.40 m (R² of 0.78, a3) to 5.06 m (R² of 0.02, a5). This is to be expected given the low terrain relief in our study area (mean slope of 4.7 ± 3%).
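
The accuracy figures quoted in this subsection rely on the statistics defined in (11) through (13). A minimal sketch of these three definitions is given below; the function names are illustrative, and the inputs are assumed to be equal-length sequences of observed and estimated values with no zero observations (required by the RMSPE).

```python
import math


def r_squared(observed, estimated):
    """Coefficient of determination, as in (11)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - e) ** 2 for y, e in zip(observed, estimated))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, estimated):
    """Root-mean-square error, as in (12), in the units of the variable."""
    n = len(observed)
    return math.sqrt(sum((y - e) ** 2 for y, e in zip(observed, estimated)) / n)


def rmspe(observed, estimated):
    """Root-mean-square percentage error, as in (13), in percent."""
    n = len(observed)
    return 100.0 * math.sqrt(
        sum(((y - e) / y) ** 2 for y, e in zip(observed, estimated)) / n
    )
```

For example, with observed heights of 10, 20, and 30 m and estimates of 12, 18, and 30 m, the residual sum of squares is 8 and the total sum of squares is 200, so R² is 0.96 and the RMSE is the square root of 8/3, about 1.63 m.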
Fig. 8. Comparison between Measured Hdom and estimated Hdom using only Wext values (Hdom = α.Wext + β) from the six algorithms (a1 through a6).
RMSE is expressed in meters (m).
The low accuracy obtained with algorithm a5 is due to the low values used for the front and back thresholds (3σ and 2σ, Table I), which result in larger waveform extents. This is evident when trying to estimate canopy heights based solely on the waveform extent (in situ Hdom = α·Wext + β), with the results in Fig. 8 showing that the metrics extracted using algorithm a5, especially the waveform extent (Wext), were the least correlated to Hdom, with an RMSE of 4.38 m (R² of 0.26).

In contrast to the linear regression models, canopy height estimation using SRH or RFH with metrics from algorithms a2, a3, a4, and a6 showed accuracies similar to those obtained with algorithm a1 (see Table VI). Algorithm a5, however, was slightly less accurate, with an RMSE of 1.6 m (R² of 0.90) and 1.80 m (R² of 0.88) when using SRH and RFH, respectively.

Finally, the effect of the method used to select the ground return was studied. The results presented thus far have been based on detecting the ground mode from the SM provided in the L2A data product. SM detects the ground return as being the lowest, nonnoisy mode, which usually refers to the last detected mode. Previous studies that used GLAS waveforms suggested that the mode with the higher amplitude between the last two modes

TABLE VII
DIFFERENCE IN ACCURACY (RMSE IN M) ON Hdom BASED ON THE CHOICE OF THE SELECTED GROUND MODE FOR THE DIFFERENT MODELS DESCRIBED IN SECTION III-A, AND METRICS EXTRACTED USING ALGORITHM A1
SM = ground mode from SM (last detected mode) provided in the L2A data product. HL2M = ground mode corresponding to the higher amplitude between the last two modes.
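
The two ground-selection rules compared here (SM and HL2M) can be contrasted with a short sketch. It assumes the detected modes are available as a list of (bin index, amplitude) pairs ordered from the top of the waveform to the bottom; the function names and the tie-breaking choice (keeping the last mode when both amplitudes are equal) are assumptions, not part of the GEDI products.

```python
def ground_mode_last(modes):
    """SM-style rule: take the last detected mode as the ground return.
    `modes` is a list of (bin_index, amplitude), ordered top to bottom."""
    if not modes:
        raise ValueError("no modes detected")
    return modes[-1]


def ground_mode_stronger_of_last_two(modes):
    """HL2M-style rule: take whichever of the last two modes has the
    higher amplitude (the last mode wins ties, by assumption)."""
    if not modes:
        raise ValueError("no modes detected")
    if len(modes) == 1:
        return modes[-1]
    previous, last = modes[-2], modes[-1]
    return last if last[1] >= previous[1] else previous
```

The two rules differ only when a weak final mode follows a stronger one, in which case the HL2M rule places the ground higher in the waveform and therefore yields smaller relative heights than the SM rule.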
TABLE VIII
MODELS’ PERFORMANCE AND THE FITTED LINEAR EQUATIONS FOR ESTIMATING EUCALYPTUS STAND WOOD VOLUME (V)
The variables are described in Section II-B-2, with the models described in Section III-B
TABLE IX
ACCURACY (RMSE IN M3 .HA−1 ) OF THE MODELS PRESENTED IN SECTION III-B FOR THE ESTIMATION OF V USING GEDI
METRIC VALUES EXTRACTED USING THE SIX DIFFERENT ALGORITHMS (A1 THROUGH A6)
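Accuracy in these tables is reported as RMSE in m3.ha−1. The relative RMSE and bias quoted separately for stands below and above 250 m3.ha−1 can be computed along the lines of the sketch below. Several points are assumptions made for illustration: relative RMSE is taken as RMSE divided by the mean in situ volume of the class, bias follows the stated definition (in situ V minus estimated V, averaged), stands at exactly the threshold are assigned to the upper class, and the name `accuracy_by_class` and the sample values are hypothetical.

```python
import math
from statistics import mean


def accuracy_by_class(v_obs, v_est, threshold=250.0):
    """RMSE, relative RMSE (%) and bias for stands below / at-or-above a volume threshold."""
    classes = {"low": [], "high": []}
    for obs, est in zip(v_obs, v_est):
        key = "low" if obs < threshold else "high"
        classes[key].append((obs, est))

    stats = {}
    for key, pairs in classes.items():
        if not pairs:
            continue  # no stand falls in this class
        diffs = [obs - est for obs, est in pairs]  # in situ minus estimated
        rmse = math.sqrt(mean([d * d for d in diffs]))
        mean_obs = mean([obs for obs, _ in pairs])
        stats[key] = {
            "rmse": rmse,
            "rel_rmse_pct": 100.0 * rmse / mean_obs,
            "bias": mean(diffs),
        }
    return stats


# Made-up stand volumes (m3.ha-1), for illustration only.
v_obs_demo = [100.0, 200.0, 300.0, 400.0]
v_est_demo = [110.0, 190.0, 280.0, 380.0]
stats_demo = accuracy_by_class(v_obs_demo, v_est_demo)
print(stats_demo)
```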
Fig. 9. Comparison of measured vs. estimated wood volume from the models presented in section III.B using GEDI metrics extracted with algorithm a1. RMSE is expressed in m3.ha−1.

the four models, the relative RMSE increased from ∼18% for V less than 250 m3.ha−1 to ∼40% for V higher than 250 m3.ha−1. The bias (mean difference between in situ V and estimated V) was also more apparent for V higher than 250 m3.ha−1: it increased from 2.2 m3.ha−1 (average bias from all models) for V less than 250 m3.ha−1 to 26.5 m3.ha−1 for V higher than 250 m3.ha−1.

The variable importance test of the GEDI metrics (see Fig. 10) showed that the three metrics contributing most to the estimation of V using the random forest regressor (RFV) were the same as those for the estimation of canopy heights, with the highest contributor being RH90, followed by RH80 and RH100.

V. DISCUSSION
The different models tested in this article showed that GEDI waveform metrics can be used to estimate canopy heights and wood volumes with good accuracy, with an RMSPE of 7.1% on canopy height estimation and 20.4% on wood volume estimation. Moreover, GEDI waveforms appear to be of high quality given the very low variability in the estimates of Hdom and V from the individual footprints within a given stand. Indeed, the accuracy (RMSE) on the estimation of Hdom using the mean estimates from SRH of the individual footprints was 1.33 m (R2 of 0.93) versus 1.32 m (R2 of 0.93) when averaging Hdom estimates from GEDI for each stand. Similarly, the accuracy on the estimates of V using the mean estimates from SRV was 24.39 m3.ha−1 (R2 of 0.90) versus 23.93 m3.ha−1 (R2 of 0.91) for the average of V from GEDI over each stand.

The most important GEDI variable for the estimation of Hdom and V is RH90, which explained more than 87% and 84% of the variability of Hdom and V, respectively. Some of the remaining variability is explained by different GEDI metrics based on the
Fig. 10. Ranking of the variables by decreasing order of importance in the RFV model for stand wood volume estimation. The importance is measured via the average percentage increase of MSE (%IncMSE) over 50 repetitions. The red bars indicate the standard deviation of %IncMSE.
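A permutation-based importance of the kind summarized in Fig. 10 can be approximated with the sketch below. It is a simplified stand-in, not the procedure used for the figure: it permutes one predictor at a time on a fixed data set and reports the percentage increase in MSE relative to the unpermuted baseline, whereas random forest implementations usually compute this on out-of-bag samples. The names `mse` and `pct_inc_mse`, the toy predictor, and the data are hypothetical.

```python
import random
from statistics import mean, pstdev


def mse(predict, X, y):
    """Mean squared error of a row-wise predictor on (X, y)."""
    return mean([(predict(row) - target) ** 2 for row, target in zip(X, y)])


def pct_inc_mse(predict, X, y, n_repeats=50, seed=0):
    """Per feature: (mean, standard deviation) of the % increase in MSE after permutation."""
    rng = random.Random(seed)
    base = mse(predict, X, y)  # must be non-zero for a relative increase
    results = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(100.0 * (mse(predict, X_perm, y) - base) / base)
        results.append((mean(increases), pstdev(increases)))
    return results


# Toy data: the target depends on feature 0 only; feature 1 is ignored by the predictor.
X_demo = [[float(i), float((i * 7) % 10)] for i in range(1, 11)]
y_demo = [2.0 * row[0] + (0.5 if i % 2 else -0.5) for i, row in enumerate(X_demo)]
predict_demo = lambda row: 2.0 * row[0]

importance = pct_inc_mse(predict_demo, X_demo, y_demo, n_repeats=50, seed=0)
for name, (avg, sd) in zip(["feature_0", "feature_1"], importance):
    print(f"{name}: %IncMSE = {avg:.1f} (sd {sd:.1f})")
```

Sorting the features by their mean increase gives a ranking analogous to the bars of such a figure, with the standard deviation over repetitions playing the role of the error bars.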
AUTHOR CONTRIBUTIONS
Ibrahim Fayad: conceptualization, methodology, software, validation, formal analysis, data curation, visualization, writing (original draft). Nicolas Baghdadi: conceptualization, methodology, validation, formal analysis, data curation, writing (original draft). Clayton Alcarde Alvares: conceptualization, validation, writing (review and editing). Jose Luiz Stape: conceptualization, validation, writing (review and editing). Jean Stéphane Bailly: validation, writing (review and editing). Henrique Ferraço Scolforo: conceptualization, validation, writing (review and editing). Mehrez Zribi: validation, writing (review and editing). Guerric Le Maire: conceptualization, validation, writing (review and editing).

ACKNOWLEDGMENT
The authors would like to thank the GEDI team and the NASA LPDAAC (Land Processes Distributed Active Archive Center) for providing GEDI data. The authors acknowledge Suzano's researchers Italo Ramos Cegatta, Renan Tarenta Meirelles Brasil and Carla Foster Feria for their technical support and the CIRAD Suzano project. Suzano SA Company supported the forest-field data collection.

IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 14, 2021

Ibrahim Fayad received the Engineering degree in computer and telecommunications in 2011 and the Ph.D. degree in automatic and microelectronic systems in 2015, both from the University of Montpellier, Montpellier, France.
He is currently a Research Engineer with the National Research Institute for Agriculture, Food and the Environment, Montpellier, France. His research interests include machine learning for the retrieval of environmental parameters using remote sensing data.

Jose Luiz Stape received the Ph.D. degree in forest ecology from Colorado State University, Fort Collins, CO, USA, in 2002.
He is a Permanent Graduate Professor of Forest Ecophysiology with Sao Paulo State University (UNESP, Brazil), Sao Paulo, Brazil. He was with the University of Sao Paulo and with the North Carolina State University, and has worked across many countries and companies to improve silvicultural recommendations for the sustainability of forest plantations, including clonal deployment, site preparation, nutrition, and spacing. To better evaluate the factors limiting forest productivity and controlling C allocation, he coordinated the establishment, with other scientists, of four large Eucalyptus and Pine cooperative research programs in Brazil via IPEF (BEPP, Eucflux, TECHS and PPPIB) and a research network at Suzano company (G2M2P2). The use of remote sensing to improve the monitoring, management, and modeling of planted forests is now a main focus of his research.

Jean Stéphane Bailly received the Engineering degree in agronomy, the M.Sc. degree in biostatistics, and the Ph.D. degree in hydrology from the University of Montpellier, Montpellier, France, in 1990, 2003, and 2007, respectively.
He is currently a Senior Lecturer of physical geography and geostatistics in AgroParis-Tech, Paris, France. He is currently a Research Fellow with the UM LISAHLab, Montpellier, France. His research interests include spatial observations and modeling for hydrological issues.

Henrique Ferraço Scolforo received the Ph.D. degree in forest biometrics from the North Carolina State University, Raleigh, NC, USA, in 2018.
His research focused on growth and yield modeling sensitive to climate and clonal variation applied to eucalypt stands in Brazil. Since 2018, he has been leading the biometrics, inventory, growth, and yield studies with Suzano SA, São Paulo, Brazil.