
Mapping global grassland dynamics 2000–2022 at 30 m spatial resolution using spatiotemporal Machine Learning

Leandro Parente
OpenGeoHub Foundation https://orcid.org/0000-0003-1589-0467
Lindsey Sloat
World Resources Institute https://orcid.org/0000-0002-2986-9725
Vinicius Mesquita
Remote Sensing and GIS Laboratory (LAPIG/UFG) https://orcid.org/0009-0004-9873-2775
Davide Consoli
OpenGeoHub Foundation https://orcid.org/0000-0003-4007-2896
Radost Stanimirova
World Resources Institute https://orcid.org/0000-0001-9617-5830
Tomislav Hengl
OpenGeoHub Foundation https://orcid.org/0000-0002-9921-5129
Carmelo Bonannella
OpenGeoHub Foundation https://orcid.org/0000-0002-5391-8427
Nathália Teles
Remote Sensing and GIS Laboratory (LAPIG/UFG) https://orcid.org/0000-0002-4265-3080
Ichsani Wheeler
OpenGeoHub Foundation https://orcid.org/0000-0002-9425-9157
Steffen Ehrmann
German Centre for Integrative Biodiversity Research (iDiv)
Maria Hunter
Remote Sensing and GIS Laboratory (LAPIG/UFG) https://orcid.org/0000-0001-6449-9718
Laerte Ferreira
Remote Sensing and GIS Laboratory (LAPIG/UFG) https://orcid.org/0000-0002-0489-1141
Ana Paula Mattos
Remote Sensing and GIS Laboratory (LAPIG/UFG) https://orcid.org/0000-0002-9442-4589
Bernard Oliveira
Remote Sensing and GIS Laboratory (LAPIG/UFG) https://orcid.org/0000-0003-1311-1116
Carsten Meyer
German Centre for Integrative Biodiversity Research (iDiv) https://orcid.org/0000-0003-3927-5856
Murat Şahin
OpenGeoHub Foundation https://orcid.org/0000-0003-4143-1467
Martijn Witjes
OpenGeoHub Foundation https://orcid.org/0000-0002-0962-6478
Steffen Fritz
International Institute for Applied Systems Analysis (IIASA) https://orcid.org/0000-0002-9853-8903
Žiga Malek
International Institute for Applied Systems Analysis (IIASA) https://orcid.org/0000-0002-6981-6708
Fred Stolle
World Resources Institute https://orcid.org/0000-0002-3961-8591

Data Note

Keywords: landsat, modis, random forest, machine learning, grassland, livestock

Posted Date: June 4th, 2024

DOI: https://doi.org/10.21203/rs.3.rs-4514820/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Additional Declarations: The authors declare no competing interests.


Leandro Parente1*, Lindsey Sloat2, Vinicius Mesquita3, Davide Consoli1, Radost
Stanimirova2, Tomislav Hengl1, Carmelo Bonannella1,4, Nathália Teles3, Ichsani Wheeler1,
Steffen Ehrmann5,6,8, Maria Hunter3, Laerte Ferreira3, Ana Paula Mattos3, Bernard
Oliveira3, Carsten Meyer6,7,8, Murat Şahin1, Martijn Witjes1,4, Steffen Fritz5, Žiga Malek5,
and Fred Stolle2

1 OpenGeoHub Foundation, Doorwerth, The Netherlands
2 Land & Carbon Lab, World Resources Institute, Washington DC, USA
3 Remote Sensing and GIS Laboratory (LAPIG/UFG), Goiânia, Brazil
4 Laboratory of Geo-Information Science and Remote Sensing, Wageningen University & Research, Wageningen, The Netherlands
5 International Institute for Applied Systems Analysis (IIASA), Laxenburg, Austria
6 German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
7 Institute of Geosciences and Geography, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
8 Institute of Biology, Leipzig University, Leipzig, Germany
* corresponding author(s): Leandro Parente ([email protected])

ABSTRACT

The paper describes the production and evaluation of global grassland dynamics mapped annually for 2000–2022 at 30 m
spatial resolution. The dataset, showing the spatiotemporal distribution of cultivated and natural/semi-natural grassland classes,
was produced using the GLAD Landsat ARD-2 image archive, accompanied by climatic, landform and proximity covariates,
spatiotemporal machine learning (per-class Random Forest) and over 2.3M reference samples (visually interpreted in Very
High Resolution imagery). Custom probability thresholds (based on five-fold spatial cross-validation) were used to derive
dominant class maps with balanced precision and recall values of 0.64 and 0.75 for cultivated and natural/semi-natural grassland,
respectively. The produced maps (about 4 TB in size) are available under an open data license as Cloud-Optimized GeoTIFFs
and as Google Earth Engine assets. The suggested uses of the data include (1) integration with other compatible land cover
products and (2) tracking the intensity and drivers of conversion of land to cultivated grasslands and from natural/semi-natural
grasslands into other land use systems.

Background & Summary

Grasslands are among the most vital global ecosystems and, comprising open grasslands, grassy shrublands, and savannas, they
cover approximately 40% of the Earth's surface [1, 2]. These ecosystems are critical for carbon sequestration, food production,
biodiversity maintenance, and cultural heritage for people all over the world [1]. Klein et al. [3] estimate that in 2000 there
were 3,322 Mha of pastures in the world, with both pastures and croplands experiencing rapid expansion. However, despite their
ecological, cultural and socioeconomic importance, no comprehensive time series of high-resolution global maps specifically
focused on grasslands yet exists. More detailed information on grassland management and use is also lacking,
particularly at high resolutions and over extended periods of time. Geospatial monitoring of these areas is urgently needed to
support conservation efforts, to underpin meaningful corporate supply chain no-conversion commitments, to reduce greenhouse
gas emissions from the land sector [4, 5], to support positive land use planning, to enable finance for nature-based
solutions and to contribute to restoring degraded landscapes [1, 2].
Grasslands are one of the most challenging classes in land cover monitoring, as they are shaped by various natural, anthropogenic,
and social factors that vary between regions and cultures [6]. General-purpose global land cover maps have traditionally
mapped classes such as grasslands and shrublands at coarse spatial resolution, such as 500 m for NASA's Global Land
Cover Type [7] and 300 m for ESA's Climate Change Initiative Land Cover [8]. Other products such as HYDE (10 km) [3],
Earthstat (10 km) [9], and HILDA+ (1 km) [10] further differentiate grassland management systems such as pastures/rangelands
and unmanaged lands. However, their spatial resolution remains relatively coarse. In addition, the loose class definitions
of existing grassland maps significantly hinder interoperability between classification systems. Recently, higher-resolution
general-purpose land cover maps have become available by classifying Landsat (30 m) and Sentinel-2 (10 m) Earth Observation
(EO) archives [11, 12, 13, 14, 15]. These products improve the spatial resolution of grassland mapping, but they maintain a broad
definition of grasslands without incorporating information on how the land is actually used, which limits their usability
for farmers, national agencies monitoring livestock, and agricultural extension experts. National medium- to high-resolution
products [16, 17] successfully add further differentiation to grasslands, but cannot be used globally due to their
limited spatial coverage.
In response to the need for detailed global-scale monitoring products targeting grasslands, the Land & Carbon Lab
initiated the Global Pasture Watch (GPW) research consortium, gathering experts from the World Resources Institute (WRI),
OpenGeoHub Foundation, the Image Processing and GIS Laboratory at the Federal University of Goiás (LAPIG/UFG), the
International Institute for Applied Systems Analysis (IIASA), the German Centre for Integrative Biodiversity Research (iDiv),
Cornell University, and the Global Land Analysis and Discovery laboratory of the University of Maryland (GLAD). GPW aims
to advance grassland monitoring by creating recurrent collections of global mapping products from the year 2000 onward at a
suitable spatial resolution (i.e. 30 m), providing fit-for-purpose monitoring solutions that are designed to be open to
incorporating the strongly regional cultural knowledge surrounding grasslands.
In this paper, we present a novel data set with annual time series of global cultivated and natural/semi-natural grasslands
mapped at 30 m spatial resolution covering the period from 2000 to 2022. We first explain all sampling and modeling steps and
then report results of spatial cross-validation and comparison with existing datasets (e.g. GLanCE [18], UMD GLAD GLCLUC
[13], GLC_FCS30D [15]). We also visualize the annual values of the dominant class and the probability of grasslands, discuss
potential applications, and openly report the limitations and future needs of the data we have produced. The data are available
under an open license (CC-BY) and will be regularly updated and improved with additional regional contexts, with new years
added as the EO images become available.

Methods

Our mapping framework, shown in Fig. 1, was based on multiple Earth Observation (EO) data sources such as GLAD Landsat ARD-2
[19], MOD11A2 [20], MCD19A2 [21], digital terrain model derivatives and distance maps of accessibility, roads, and water.
To train the models, we used more than 2.3M reference samples visually interpreted in Very High Resolution (VHR) images
(i.e. Google Maps and Bing Maps). Two independent spatiotemporal machine learning (ML) models [22] were used to predict
each grassland class (i.e. cultivated grassland and natural/semi-natural grassland) over multiple years on a global scale. We
produced predictions for all years from 2000 to 2022, resulting in a time series of global probability maps for cultivated and
natural/semi-natural grassland at 30 m spatial resolution. Both probabilities were used to derive an integrated dominant class of
grasslands, considering a custom global threshold per class. The exact methodological steps are described in the following
sections.

Figure 1. The GPW grassland mapping framework encompasses general processing workflows, key inputs and outputs, and a
feedback loop to improve future versions of the global maps.

But what is grass? Reference sampling design

We used Feature Space Coverage Sampling (FSCS [23]) to generate reference samples. This sampling design helps improve
the representativeness of reference samples and is especially suitable for fitting multivariate predictive mapping models [23].
We used FSCS to generate 10,000 sample tiles (i.e. 1×1 km) distributed across the world. We used 87 input layers for FSCS,
shown in Table 1, restricted by a short vegetation mask that includes all pixels mapped as mosaic, shrubland, grassland, and
sparse vegetation in at least one year from 1993 to 2021 (i.e. 13 land cover classes described in Table S1), according to the
ESA/CCI global land cover time-series [24].
In practice, the FSCS steps include:

1. Principal Components Analysis (PCA) using all input layers,
2. Selection of the first 10 components (explaining 75% of variance),
3. K-Means with 10,000 clusters (targeted number of samples),
4. Calculation of Euclidean distance (in the principal component space) of all 1 km pixels to the centre of each cluster,
5. Selection of the pixel with the shortest distance for each cluster,
6. Conversion of the selected pixels to sample tiles (1×1 km).
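
The six FSCS steps can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' implementation: the function name `fscs_select`, the standardisation before PCA, and the in-memory distance matrix are assumptions made for illustration, and a global run over all 1 km pixels would need a chunked or approximate nearest-pixel search.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def fscs_select(features, n_components=10, n_samples=10_000, seed=0):
    """Return the row index of the pixel closest to each K-Means centre.

    features: array (n_pixels, n_layers), e.g. the 87 FSCS input layers
    sampled at every 1 km pixel inside the short vegetation mask.
    """
    # Assumption: layers are standardised before PCA (not stated in the text).
    scaled = StandardScaler().fit_transform(features)
    # Steps 1-2: PCA, keeping the first components.
    pcs = PCA(n_components=n_components, random_state=seed).fit_transform(scaled)
    # Step 3: one cluster per targeted sample tile.
    km = KMeans(n_clusters=n_samples, n_init=10, random_state=seed).fit(pcs)
    # Step 4: Euclidean distance of every pixel to every cluster centre.
    dist = km.transform(pcs)  # shape (n_pixels, n_clusters)
    # Step 5: nearest pixel per cluster; step 6 (tile creation) is left out.
    return np.argmin(dist, axis=0)
```

Calling `fscs_select(features)` with the defaults mirrors the reported settings (10 components, 10,000 clusters); each returned index would then be converted to a 1×1 km tile.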

Table 1. Input layers for the Feature Space Coverage Sampling (FSCS). All layers were resampled to 1 km by average and
filtered by a short vegetation mask based on ESA/CCI global land cover maps [24]. The long-term derivatives were calculated
considering the entire time period and a specific month (e.g. all Januaries from 2000 to 2021).

Theme | Product | Variable | Time period | Number of layers
Terrain | GLO-90 Copernicus Digital Elevation Model [25] | Elevation | 2011 and 2015 | 1
Terrain | Geomorpho90m [26] | Slope | 2018 | 1
Vegetation index | MODIS MOD13Q1 v061 [27] | Long-term median EVI (all months) | 2000 to 2021 | 12
Vegetation index | MODIS MOD13Q1 v061 [27] | Long-term std. deviation EVI (all months) | 2000 to 2021 | 12
Land Temperature | MODIS MOD11A2 v061 [20] | Long-term median day-time LST (all months) | 2000 to 2021 | 12
Land Temperature | MODIS MOD11A2 v061 [20] | Long-term std. day-time LST (all months) | 2000 to 2021 | 12
Land Temperature | MODIS MOD11A2 v061 [20] | Long-term median night-time LST (all months) | 2000 to 2021 | 12
Land Temperature | MODIS MOD11A2 v061 [20] | Long-term std. night-time LST (all months) | 2000 to 2021 | 12
Climate | CHELSA time-series [28] | Long-term mean precipitation (all months) | 1981 to 2018 | 12
Water | JRC Global Surface Water [29] | Water occurrence | 1984 to 2018 | 1
Total number of layers | | | | 87

Reference labeling protocol

The selected FSCS tiles were visually interpreted by 16 visual interpretation (VI) analysts, who classified the entire tile
surface into three classes (i.e. cultivated grassland, natural/semi-natural grassland and other land cover) using Google Maps
and Bing Maps imagery as reference. The analysts used a QGIS plugin (https://plugins.qgis.org/plugins/qgis-fgi-plugin)
specifically designed to optimize the classification process and evaluated 10,000 tile samples (i.e.
1×1 km). For each tile, the plugin automatically created a finer grid (i.e. 10 m grid cells), where each analyst manually assigned
a single class and a reference date to a group of grid cells according to the base imagery, as shown in Fig. 2. For Google Maps
images, the analysts obtained the reference date from Google Earth software, and for Bing Maps, the plugin retrieved it through the
Bing API. A total of 2,995 tiles were discarded due to a lack of suitable VHR images, predominantly in regions at
latitudes higher than 60.5 degrees north.

Reference labeling criteria

To capture the inherent complexity of grassland ecosystems, we developed a hierarchical ontology based on
[30] (see Table S2) that attempts to separate natural/semi-natural grasslands without significant human-directed
management from those under heavy management and/or entirely cultivated grasslands. The reference labeling criteria were
by necessity focused only on these two end-member states (i.e. cultivated and natural/semi-natural), taking into consideration
features that can be objectively identified in VHR imagery. The reference labeling criteria, shown in Table 2, were used to train
all analysts to visually distinguish our mapping classes according to the following descriptions:

• Cultivated grassland includes areas where grasses and other forage plants have been intentionally planted and managed,
as well as areas of native grassland-type vegetation where they clearly exhibit active and 'heavy' management for specific
human-directed uses, such as directed grazing of livestock. Many natural/semi-natural landscapes exist on a human
intervention gradient, which is assumed by our criteria to initially be indicated by the presence of livestock-related
infrastructure such as fencing and watering points. As interventions become more intensive through time, practices such
as regular seeding, ploughing, mowing, fertilization, controlled grazing, and sometimes irrigation, aimed at enhancing
productivity and maintaining the desired vegetation cover, start to become visible and/or implied by the visual character
of the landscape. In general, the nonexclusive criteria applied to this class can be approximated from Table 2.

• Natural/semi-natural grassland includes relatively undisturbed native grasslands/short-height vegetation, such as
steppes and tundra, as well as areas that have experienced varying degrees of human activity in the past. These grasslands
may contain a mix of native and introduced species due to historical land use and natural processes. In general, they
exhibit natural-looking patterns of varied vegetation and clearly ordered hydrological relationships throughout the
landscape. This class also includes land that may have become degraded due to overuse or mismanagement but is not
currently under intensive restoration or active management. Semi-natural areas may still have minimal active management
and low-intensity practices such as periodic burning or episodic grazing under human direction to maintain the current
grassy state or as part of arid or semi-arid transhumance practices. In general, the nonexclusive criteria applied to this
class can be approximated from Table 2.

• Other land cover includes all other classes of land cover and land use, including, but not limited to, water bodies, rivers,
snow, permanent ice, built-up areas, forest, annual crops (e.g. soybean, maize), perennial crops (e.g. coffee), bare ground,
rocky outcrops, and wetlands. The definitions of the criteria may vary according to the types of LULC classes. Generally,
we considered everything that does not fit into the other two classes as Other land cover.

Our reference labeling criteria were re-evaluated and refined through iterative discussions involving the GPW team, and
may be actively informed by external analysts/users bringing additional cultural and regional expert knowledge, systematically
contributing to improvements in our grassland reference samples.

Table 2. Visual interpretation criteria used in the reference labeling protocol. Short-range variation refers to distances of 10s
to 100s of meters, while long-range variation covers areas beyond 1 km, encompassing a 9 km² landscape context.

Criteria | Cultivated grasslands | Semi-natural/natural grasslands
Colour & texture variation
Short range variation | Colour/texture are geometrically regularised, high homogeneity indicative of species and/or temporal management of vegetation. | Colour/texture variations are pronounced, naturalistic patterning indicating a diversity of vegetation responding to soil/water variations.
Long range variation | Landscapes are unnaturally uniform due to management activities, disregarding soil/water variations. | Landscapes reflect soil/water variations, are ordered with natural patterning & plant variation.
Seasonal variation | High between field heterogeneity within & between seasons. | Seasonal progression visible for similar looking vegetation types.
Human influence & management
Animals | Presence of domesticated animals. | Absence of domesticated animals.
Animal infrastructure | Structures, enclosures, access roads etc. indicate active management. | Human management structures are mostly absent.
Short range management | Clear geographically zoned schedules for plowing, mowing etc. | Absence of imposed management infrastructure at the field scale.
Long range management | Infrastructure to serve multiple fields/properties (e.g. access roads, watering lines etc.). | Visually connected to natural landscape with little evidence of imposed management.
Temporal management | Long mixed farming rotations, typically managed over 2-5 years. | Continued grassland presence when inspecting several seasons.
Contextual analysis (cultural criteria)
Proximity | Co-location with cropping lands likely indicates intensive management. | Distance from human accessibility indicates more naturalness.
Seasonal changes | Time between grasslands & crops is an important cultural factor (2-5 yrs). | Absence of cropping post expected indicates natural systems.
Regional knowledge | Presence of livestock occurs with cultivated grasslands, particularly when animal containment is visible. | Implicit cultural impressions are rooted in South America & Brazilian Portuguese.

Figure 2. Spatial distribution of tiles with available information (single, 2 or more years) and examples of raw interpreted
tiles and their conversion to points for training the prediction models.

Table 3. Datasets of pre-existing reference samples harmonized to our classification taxonomy.

Datasets | Spatial distribution | Time period | Number of individual samples
WorldCereal [33] | Global | 2016–2021 | 36,427,760
EuroCrops [34] | Europe | 2018–2021 | 13,484,591
MapBiomas Brazil [17] | Brazil | 2000–2018 | 1,103,003
GLanCE [18] | Global | 2000–2021 | 8,374,634
Land Use/Land Cover Area Frame Survey (LUCAS [35]) | Europe | 2006–2018 | 989,892
Land Change Monitoring, Assessment, and Projection (LCMap [36]) | U.S. (CONUS) | 2000–2018 | 341,943
G-GLOPS training dataset [37] | Global | 2021 | 8,269,554
Total | | | 66,991,467

Reference sample pre-processing and filtering

All classified tiles with an assigned reference date were converted to point samples considering a spatial support of 60 m (i.e. two Landsat
pixels per side). For each point sample, we derived a class proportion based on the number of grid cells (i.e. 10 m) for each class. For example, a point
sample with 30 grid cells classified as cultivated grassland had a class proportion equal to 0.83 (i.e. 30 divided by 36). Since we implemented
an independent binary classification model per grassland class, we kept only point samples with a 100% class proportion in our reference
set, aiming for predictions based on distinct classes.
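
The class-proportion rule described above can be written out as a small worked example. The sketch below assumes that a 60 m point covers a 6×6 block of 10 m grid cells (36 cells) and uses hypothetical integer class codes (1 = cultivated, 2 = natural/semi-natural, 3 = other); neither the codes nor the function names come from the paper.

```python
import numpy as np

CELLS_PER_POINT = 36  # 60 m support over 10 m cells: 6 x 6 grid cells


def class_proportions(cell_labels, classes=(1, 2, 3)):
    """Fraction of a point's grid cells assigned to each class code."""
    cell_labels = np.asarray(cell_labels)
    return {c: float(np.mean(cell_labels == c)) for c in classes}


def is_pure(cell_labels):
    """True when every grid cell of the point carries the same class."""
    return max(class_proportions(cell_labels).values()) == 1.0


# Worked example from the text: 30 of 36 cells are cultivated grassland.
mixed_point = [1] * 30 + [3] * 6
proportion = class_proportions(mixed_point)[1]  # 30 / 36, about 0.83
```

Under this rule the mixed point is dropped, since only points with a class proportion of exactly 1.0 are kept for training.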
For point samples visually interpreted in two years (i.e. different reference dates for Bing Maps and Google Maps), we implemented a
data augmentation approach to increase the number of samples in consecutive years in our model. Every point sample with the same class
according to Bing Maps and Google Maps, and less than 5 years of time difference, was replicated in all intermediate years. For example, a
point sample of cultivated grassland in 2010, according to Google Maps, and in 2014, according to Bing Maps, was replicated in 2011, 2012
and 2013. Assuming a minimum rotation period of 5 years for crops and grasslands [31], this approach resulted in approximately 300,000
additional samples, mostly located in Europe, the U.S., India and South America.
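
A minimal sketch of this augmentation rule follows. The function name and the label strings are hypothetical; the logic only encodes what the paragraph states: same class on both dates, a gap of less than 5 years, and replication in the intermediate years.

```python
def augment_between_dates(label_a, year_a, label_b, year_b, max_gap=5):
    """Return (year, label) pairs for the years strictly between two dates.

    An empty list is returned when the two interpretations disagree or
    when the time difference is not shorter than max_gap years.
    """
    if label_a != label_b:
        return []
    first, last = sorted((year_a, year_b))
    if last - first >= max_gap:
        return []
    return [(year, label_a) for year in range(first + 1, last)]


# Example from the text: cultivated grassland in 2010 (Google Maps)
# and in 2014 (Bing Maps) is replicated in 2011, 2012 and 2013.
extra = augment_between_dates("cultivated", 2010, "cultivated", 2014)
```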
The point samples were filtered considering the disagreement between our reference classes and three global land cover products (i.e.
UMD GLAD GLCLUC [13], GLC_FCS30D [15] and ESA WorldCover 2020 [14]), from which we obtained the mapped classes for multiple
years (i.e. 2000, 2005, 2010, 2015 and 2020). All samples of cultivated grassland and natural/semi-natural grassland mapped as urban areas,
forest, cropland, water, snow, or wetlands by at least two global products in two years were removed. Likewise, all samples of other land
cover predicted as grassland, short vegetation or herbaceous by at least two global products across two years were removed (for details of
the filtering rules, see Table S3). This process removed 75,129 points (i.e. about 3% of the total), improving the overall quality of our training
data (specifically for augmented samples with a crop-grassland rotation period of less than 5 years) and resulting in 2,353,785 point samples
distributed across the time series 2000–2022 (see Figures S1 and S2).

Harmonization of pre-existing reference samples

To comprehensively compare our global grassland maps with existing LULC mapping initiatives, we harmonized reference samples from 7
datasets, shown in Table 3. This process involved translating the original LULC classifications of these datasets into our three classes (i.e.
cultivated grassland, natural/semi-natural grassland and other land cover).
The harmonization process relied on the original class definitions and expert knowledge to map LULC classes across different
datasets accurately. This involved meticulously comparing the definitions of LULC classes within each dataset with the classification scheme
described above. The crosswalk/class harmonization tables were implemented using Python computational notebooks and are available in
Zenodo [32]. As a result, we obtained 66,991,467 harmonized individual samples (unique points in geographical space and time), which were
used in our agreement assessment analyses.

GLAD Landsat ARD-2

The primary EO data input for our spatiotemporal modeling was the global Landsat Analysis Ready Data developed by the Global Land
Analysis and Discovery Lab at the University of Maryland (GLAD ARD) [19]. GLAD ARD provides a 16-day time series of tiled Landsat
normalized surface reflectance from 1997 onward. The entire Landsat 5, 7, 8, and 9 Collection 2 USGS data archive was used to produce
the data set [38]. The Landsat data processing algorithm included per-pixel observation quality assessment, reflectance normalization, and
anisotropy correction. The Moderate Resolution Imaging Spectroradiometer (MODIS) MOD44C surface reflectance product was used as a
normalization target for a single-step reflectance bias and anisotropy correction. Each 16-day composite includes the best-quality observation
and contains seven spectral bands (i.e. blue, green, red, near-infrared (NIR), short-wave infrared 1 (SWIR1), short-wave infrared 2
(SWIR2), and thermal) and a quality assessment band that flags clouds, cloud shadows, snow/ice, haze, water, and clear-sky land. Since
our reference samples are sparsely distributed over time, we used GLAD ARD instead of the USGS Landsat collection to take
advantage of the consistent pixel values across different Landsat systems over the years, improving the temporal generalization of our models
and reducing the need to sample all mapped periods.

Landsat temporal aggregation and imputation

To reduce the impact of cloud cover and enable the incorporation of intra-annual seasonality in our features, we aggregated the Landsat
ARD-2 time series (1997–2022) into bi-monthly temporal composites. For every GLAD tile (i.e. 1×1 geographic degree), we executed the
following steps [39]:

1. Removal of all pixels classified as cloud, cloud shadow, haze, cloud buffer, shadow buffer and shadow high likelihood according to the
quality assessment band (mask values: 3, 4, 7, 8, 9, 10);
2. Conversion of pixel values to 8-bit by linear normalization, resulting in values ranging from 0 to 250;
3. Temporal aggregation of all clear-sky pixels for a 2-month period using an average weighted by cloud_cover (estimated for each
date and tile);
4. Imputation of the remaining data gaps using time-series reconstruction, relying solely on clear-sky pixels acquired on previous dates
(e.g. gaps in the Jan–Feb 2002 composite were filled from clear-sky pixels of 1997, 1998, 1999, 2000 and 2001). The imputed values were
derived using Seasonally Weighted Average Generalization (SWAG), which applied a vector of weights that prioritized pixel values
from the same bi-month period and previous years over those from neighboring regions or different bi-month periods [39].
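
Steps 1 to 3 above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the published processing chain: the reflectance ceiling `max_refl`, the weighting of each date by `100 - cloud_cover` (in percent), and the function name are all hypothetical, and the SWAG imputation of step 4 is not reproduced (gaps are simply left as NaN).

```python
import numpy as np

MASKED_QA = (3, 4, 7, 8, 9, 10)  # QA codes removed in step 1


def bimonthly_composite(refl, qa, cloud_cover, max_refl=40000.0):
    """Aggregate a (time, y, x) stack of 16-day images into one composite.

    refl: reflectance values; qa: QA codes with the same shape;
    cloud_cover: per-date cloud cover in percent, shape (time,).
    Returns a float array with NaN where no clear-sky pixel exists.
    """
    refl = np.asarray(refl, dtype=float)
    # Step 1: keep only pixels whose QA code is not in the masked list.
    clear = ~np.isin(qa, MASKED_QA)
    # Step 2: linear rescaling to 0-250 (max_refl is an assumed ceiling).
    scaled = np.clip(refl / max_refl, 0.0, 1.0) * 250.0
    # Step 3: weighted average; clearer dates weigh more (assumed scheme).
    weights = (100.0 - np.asarray(cloud_cover, dtype=float))[:, None, None] * clear
    wsum = weights.sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        composite = (scaled * weights).sum(axis=0) / wsum
    return np.where(wsum > 0, composite, np.nan)
```

Pixels that remain NaN after this step are the ones that the gap-filling of step 4 would need to reconstruct from earlier composites.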

Landsat-derived indices

In addition to the bi-monthly aggregates for the reflectance bands, we also incorporated several key vegetation and water indices as predictor
variables for modeling purposes. These indices include the Bare Soil Index (BSI) [40], Enhanced Vegetation Index (EVI) [41], the Modified
Normalized Burn Ratio (NBR2), also called Normalized Difference Tillage Index (NDTI) [42], the Normalized Difference Vegetation Index
(NDVI) [43], the Normalized Difference Water Index (NDWI) [44] and the near-infrared reflectance of vegetation (NIRv) [45]. Each of
these indices was derived from different combinations of the reflectance bands and provides unique information on vegetation health,
moisture content, severity of burns, and overall ecological conditions. We also included a temporally aggregated index, Bare Soil Fraction
(BSF) [46], which is used to capture processes that require a longer temporal frame for sensible quantification: it is determined by the
proportion of time the NDVI is <0.35 over the six bi-monthly aggregates [39]. In addition to spectral indices, we derived per-pixel Fraction
of Absorbed Photosynthetically Active Radiation (FAPAR) using its correlation with NDVI [47]. Table S4 summarizes the formulas for each
Landsat-derived index utilized in our modeling.
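
As an illustration, three of these indices and the Bare Soil Fraction can be computed as below. The formulas are the commonly used textbook forms (NDVI and NBR2 as normalized differences, NIRv as NDVI times NIR reflectance) and are an assumption here; the exact variants used in the paper are those listed in Table S4. Inputs are assumed to be reflectances on a 0-1 scale.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def nbr2(swir1, swir2):
    """Modified Normalized Burn Ratio (also used as NDTI)."""
    return (swir1 - swir2) / (swir1 + swir2)


def nirv(nir, red):
    """Near-infrared reflectance of vegetation: NDVI scaled by NIR."""
    return ndvi(nir, red) * nir


def bare_soil_fraction(ndvi_bimonthly, threshold=0.35):
    """Share of the bi-monthly NDVI values that fall below the threshold.

    ndvi_bimonthly: array with the six bi-monthly aggregates on axis 0.
    """
    return np.mean(np.asarray(ndvi_bimonthly) < threshold, axis=0)
```

For a pixel whose six bi-monthly NDVI values are 0.2, 0.3, 0.5, 0.6, 0.4 and 0.1, three values are below 0.35, so the BSF is 0.5.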

Land surface data

Land surface data was obtained from the MODIS Land Surface Temperature and Emissivity (LST&E) product, specifically MOD11A2 [20].
This product is available at a spatial resolution of 1 km and provides 8-day composite data that include both daytime and nighttime surface
temperatures. To adapt these data for our analysis, we aggregated the 8-day composites into monthly averages, facilitating the calculation
of long-term temperature trends for the period from 2000 to 2022. Specifically, we computed the median (50th quantile) and the standard
deviation for both daytime and nighttime temperatures on a monthly basis. This processing yielded a total of 48 input features for our
modeling. We also used MODIS water vapor data, specifically MCD19A2, which captures column water vapor above the ground using
near-IR bands. We aggregated the daily product into monthly composites, calculating the mean and standard deviation of positive, non-cloudy
observations. The remaining no-data values were imputed using a gap-filling algorithm; for more detailed information on the methodology
and data processing steps, refer to the Zenodo entry Parente et al. [48], and Consoli et al. [39].
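
The count of 48 LST features follows from 12 months × 2 statistics × 2 overpasses (day and night). A minimal sketch of that aggregation is given below; the array layout (years on the first axis, months on the second) and the function names are assumptions for illustration only.

```python
import numpy as np


def long_term_monthly_stats(monthly):
    """Long-term median and standard deviation per calendar month.

    monthly: array (n_years, 12, ...) of monthly mean LST values.
    Returns two arrays with shape (12, ...), ignoring NaN gaps.
    """
    monthly = np.asarray(monthly, dtype=float)
    return np.nanmedian(monthly, axis=0), np.nanstd(monthly, axis=0)


def lst_features(day, night):
    """Stack day and night statistics into a single feature vector."""
    parts = [*long_term_monthly_stats(day), *long_term_monthly_stats(night)]
    return np.concatenate(parts, axis=0)
```

For a single pixel with 23 years of monthly values, `lst_features` returns 48 values: 12 medians and 12 standard deviations for each of the two overpasses.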

Static raster datasets

The elevation data utilized in the modeling was obtained from the Ensemble Digital Terrain Model (EDTM) of the world at 30 m spatial
resolution [49]. This DTM results from integrating multiple sources, including ALOS AW3D [50], GLO-30 [51], MERIT DEM [52], and
various national DTMs. To quantify the isolation from urban areas and correlate it with the livestock management practices, we used a suite
of 10 global accessibility indicators calculated at 1 km resolution [53]. Class 1 represents areas with travel times of less than 30 minutes to the
nearest city of 50,000 or more inhabitants, indicating high accessibility, while class 9 refers to areas where travel time exceeds 10 hours to
reach the nearest city of 50,000 or more inhabitants, indicating very low accessibility.
We also independently developed distance maps from permanent or seasonal inland water at 100 m resolution using a Landsat-derived
product specifically developed for inland waters [54]. Similarly, we produced maps of distances to areas classified by road density, ranging
from low to high, utilizing OpenStreetMap (OSM) data. We also calculated the geometric minimum and maximum temperature as geometric
transformations based on latitude, day of the year, and elevation [55]. This calculation considered both the minimum and maximum
temperature per month, resulting in 24 input features. These variables not only capture Earth's geometry and temporal dynamics within a
year but also enable the model to differentiate between locations that, despite having similar long-term or monthly temperature profiles, are
distinct in their latitudinal positions or seasonal timing. This approach improves the model's ability to discern and predict on the basis of
subtle climatic variations influenced by geographical and temporal factors.

215 Spatiotemporal model training


216 We modeled the grassland classes separately, training one model specialized in cultivated grassland (i.e. binary classifier of cultivated grassland vs
217 other land cover) and another model specialized in natural/semi-natural grassland (i.e. binary classifier of natural/semi-natural grassland vs
218 other land cover). For each model, we ran feature selection (i.e. Recursive Feature Elimination, RFE [56]), hyperparameter tuning (i.e.
219 Successive Halving, SH [57]) and a comparison between three ML algorithms (i.e. Random Forest, RF [58]; Gradient-boosted trees, GBT
220 [59]; and Artificial Neural Network, ANN [60]).
221 Before modeling, we overlaid our point samples with the temporal and static EO data. The Landsat pixel values were associated with each
222 sample by spacetime overlay, matching the location (i.e. geographical coordinates) and the time period (i.e. year of reference) of each sample
223 with 84 Landsat composites in a specific year (i.e. seven reflectance bands and seven spectral indices for six bi-monthly aggregates). For
224 static layers (i.e. long-term MOD11A2 land surface temperature, long-term MCD19A2 water vapor, geometric temperature, static DTM, and
225 static distance maps of cities, roads, and water), the overlay considered only the sample locations, resulting in a total of 197 input features for
226 feature selection. The overlaid samples were then split into training and calibration, where 10% of samples from each visually interpreted tile
227 (i.e. 1×1 km) were randomly selected to compose the calibration set, resulting in 2,122,357 and 231,428 samples for training and calibration,
228 respectively. The calibration set was used to run the RFE and then SH, thus establishing the best features and hyperparameters to compare the
229 ML algorithms.
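The per-tile split described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the column names (`tile_id`, `label`) and the toy table are assumptions, and only the 10% random draw within each tile comes from the text.

```python
import numpy as np
import pandas as pd


def split_by_tile(samples, tile_col="tile_id", calib_frac=0.10, seed=0):
    """Draw `calib_frac` of the samples of every tile for calibration.

    The remaining samples of each tile form the training set.
    """
    calib = samples.groupby(tile_col, group_keys=False).sample(
        frac=calib_frac, random_state=seed
    )
    train = samples.drop(index=calib.index)
    return train, calib


# Toy table (assumed layout): 50 tiles with 100 overlaid samples each.
rng = np.random.default_rng(42)
samples = pd.DataFrame(
    {
        "tile_id": np.repeat(np.arange(50), 100),
        "label": rng.integers(0, 2, size=5000),
    }
)
train, calib = split_by_tile(samples)
print(len(train), len(calib))  # 4500 500
```

Drawing within each tile keeps every tile represented in both sets, which is what allows the later cross-validation to block by tile.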
230 Our RFE [56] considered a standard RF model with 60 trees and default hyper-parameters (fitted using scikit-learn [61]), targeting
231 75 features as final selection (i.e. about 38% of the total number of features) and removing the four least important features per iteration
232 (according to Gini importance). The best 75 features of each model, shown in Table S5, were then used to run SH, which considered the
233 log_loss metric [22] and five-fold spatial blocking cross-validation (CV, based on visually interpreted tiles of 1×1 km) to iteratively
234 assess different combinations of hyper-parameter candidates bounded by a customized search space. Our SH started with 500 samples,
235 keeping the best candidates (i.e. dropping the less accurate half) and doubling the number of samples per iteration until
236 reaching the full set of calibration samples. After the last iteration, the hyper-parameters with the best log_loss (i.e. lowest value), shown in

237 Table S6, were selected for each ML algorithm.
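A minimal scikit-learn sketch of the two calibration steps (RFE followed by SH) is given below. The 60-tree RF, the removal of four features per iteration, the starting size of 500 samples, the halving/doubling factor, the log-loss metric and the five-fold tile blocking follow the text; the search space, the synthetic data and the reduced sizes used in the toy run are assumptions made only to keep the example fast.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.feature_selection import RFE
from sklearn.model_selection import GroupKFold, HalvingRandomSearchCV


def select_and_tune(X, y, tiles, n_features=75, step=4, min_resources=500, seed=0):
    """Run RFE with a 60-tree RF, then successive halving on the kept features."""
    # 1) RFE: drop the `step` least important features (Gini importance) per round.
    rfe = RFE(
        RandomForestClassifier(n_estimators=60, random_state=seed),
        n_features_to_select=n_features,
        step=step,
    ).fit(X, y)

    # 2) SH: start with `min_resources` samples, keep the better half of the
    #    candidates and double the sample size at every iteration (factor=2).
    search_space = {  # hypothetical search space, not the one used in the paper
        "n_estimators": [20, 40, 60],
        "max_depth": [4, 8, None],
        "min_samples_leaf": [1, 5, 20],
    }
    search = HalvingRandomSearchCV(
        RandomForestClassifier(random_state=seed),
        search_space,
        factor=2,
        resource="n_samples",
        min_resources=min_resources,
        scoring="neg_log_loss",
        cv=GroupKFold(n_splits=5),  # samples of one tile never span two folds
        random_state=seed,
    )
    search.fit(X[:, rfe.support_], y, groups=tiles)
    return rfe.support_, search.best_params_


# Toy run on synthetic data with reduced sizes (20 features -> 8, 250 start samples).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=6, random_state=0)
tiles = np.random.default_rng(0).integers(0, 200, size=len(y))
support, best_params = select_and_tune(X, y, tiles, n_features=8, min_resources=250)
print(int(support.sum()), best_params)
```

Because `scoring="neg_log_loss"` is maximised by scikit-learn, the selected candidate is the one with the lowest log loss, matching the selection rule stated above.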
238 The comparison used the training set and the five-fold spatial blocking CV to estimate accuracy metrics adequate for probability output
239 (i.e. R2 logloss [62] and precision-recall curves [63]) for RF, GBT and ANN. For each algorithm, five ML models were trained, each using 80%
240 of the samples (i.e. four folds) for training and 20% (i.e. one fold) for validation, resulting in an out-of-fold prediction for all samples. The blocking
241 strategy kept all samples from the same tile (i.e. 1×1 km) in either the training or the validation set, reducing the spatial correlation between both sets
242 and allowing for a stricter evaluation of the error estimate [64]. This analysis excluded the interpolated point samples. The best model
243 according to R2 logloss (i.e. highest value) was used to train two global models considering all point samples (i.e. 2,353,785 samples) and 102
244 features (i.e. union of the best-selected features; see Table S5). The global models were then used to predict (worldwide) cultivated and
245 natural/semi-natural grassland for all years of the time series.
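The out-of-fold evaluation with tile blocking can be illustrated with the sketch below. The R2 logloss shown here is an assumed definition (one minus the ratio between the model log loss and the log loss of a constant-prevalence baseline); the exact formulation is given in [62]. The synthetic data and tile identifiers are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import GroupKFold, cross_val_predict


def r2_logloss(y_true, p_pred):
    """Assumed definition: 1 - logloss(model) / logloss(constant prevalence)."""
    y_true = np.asarray(y_true)
    baseline = np.full(len(y_true), y_true.mean())
    model_loss = log_loss(y_true, p_pred, labels=[0, 1])
    baseline_loss = log_loss(y_true, baseline, labels=[0, 1])
    return 1.0 - model_loss / baseline_loss


# Synthetic samples; `tile_id` stands in for the 1x1 km interpreted tiles.
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, random_state=0)
tile_id = np.random.default_rng(0).integers(0, 100, size=len(y))

# Each sample is predicted by the model that did not see its tile during training.
oof = cross_val_predict(
    RandomForestClassifier(n_estimators=60, random_state=0),
    X,
    y,
    groups=tile_id,
    cv=GroupKFold(n_splits=5),
    method="predict_proba",
)[:, 1]
print(round(r2_logloss(y, oof), 3))
```

Under this definition a value of zero means the model is no better than predicting the class prevalence everywhere, and values closer to one indicate better-calibrated probabilities.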

246 Spatiotemporal prediction


247 Global predictions were produced per GLAD tile (i.e. 1×1 geographic degree) and on a yearly basis from 2000 to 2022, resulting in annual
248 per-pixel probabilities for each class of grassland at 30 m spatial resolution. To speed up this process, we did not predict pixels
249 mapped as deserts, stable tree cover, salt pan wetlands, stable snow and ocean water in all years between 2000 and 2020, according to the UMD
250 GLAD GLCLUC product (for a complete list of land cover classes see Table S7). Furthermore, we also excluded areas mapped as buildings
251 by the World Settlement Footprint in 2019, and by its evolution product, which covers every 5 years between 1990 and 2015 [65].
252 Our RF models were compiled to a native C binary using TL2cgen [66], reducing the prediction time by a factor of three. After running the
253 predictions, the time series of probabilities were smoothed by a spatiotemporal filter, which applied a three-dimensional Savitzky-Golay
254 (SG) filter (polynomial order three and square window of five pixels) to reduce the inter-annual variability in the prediction outputs. SG
255 is a robust filter capable of significantly reducing local noise/spikes without changing the main trend of the time series [67]. The smoothed
256 probability time-series were then used to derive annual dominant grassland maps using a customized threshold for each class, according to
257 our spatial cross-validation. We considered the precision-recall curves to find probability thresholds where the recall (i.e. producer’s accuracy)
258 and precision (i.e. user’s accuracy) are balanced/equal.
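As an illustration of the smoothing step, the sketch below applies a Savitzky-Golay filter with window five and polynomial order three to a stack of annual probabilities. It is a simplification: the filter here runs along the time axis only (via `scipy.signal.savgol_filter`), whereas the filter described above is three-dimensional, and the toy stack is invented for the example.

```python
import numpy as np
from scipy.signal import savgol_filter


def smooth_probabilities(cube, window=5, polyorder=3):
    """Smooth a (years, rows, cols) stack of probabilities (0..100) along time."""
    smoothed = savgol_filter(
        cube.astype(np.float64), window_length=window, polyorder=polyorder, axis=0
    )
    # The polynomial fit can overshoot, so keep values in the valid range.
    return np.clip(smoothed, 0.0, 100.0)


# Toy stack: 23 annual layers of a 2x2 window with a constant probability of 20
# and a single-year spike of 90 in one pixel.
cube = np.full((23, 2, 2), 20.0)
cube[10, 0, 0] = 90.0
smoothed = smooth_probabilities(cube)
print(cube[10, 0, 0], round(float(smoothed[10, 0, 0]), 1))  # 90.0 54.0
```

The isolated spike is damped while the constant pixels are returned unchanged, which is the behaviour the text relies on when it says the filter reduces local noise without altering the main trend.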
259 All these processing steps ran on a High-Performance Computing (HPC) infrastructure and were distributed among the processing nodes
260 using SLURM [68] and Docker containers [69]. Approximately 120,960 CPU hours and 7.2 terabytes of RAM were used to produce the final
261 predictions. All predicted tiles were then used to create Cloud-Optimized GeoTIFF (COG) mosaics and made publicly available in Google
262 Earth Engine and the SpatioTemporal Asset Catalog (STAC).

263 Data Records


264 The global grassland maps described in this paper are available from 2000–2022 in COG (Cloud Optimized GeoTIFF) format under the
265 Creative Commons license CC-BY (see Fig. 3). A total of 69 global mosaics (i.e. 23 years for each of the three time series) are available in the WGS84
266 coordinate system (i.e. EPSG:4326) with a pixel size of 0.00025 degrees. The grassland probability values range from 0 to 100, and
267 the class values used by the dominant maps are zero for other land cover, one for cultivated grassland and two for natural/semi-natural
268 grassland.
269 All raster files are in unsigned 8-bit integer format and use 255 as no-data value, following a naming convention that organizes
270 the most important data properties in ten fields:

271 1. Project name: Global Pasture Watch (gpw)


272 2. Class name: cultivated grassland (cultiv.grassland), natural/semi-natural grassland (nat.semi.grassland) and
273 dominant grassland (grassland)
274 3. Procedure combination: Random Forest (rf), Savitzky-Golay (savgol) and balanced threshold (bthr)
275 4. Variable type: probability (p)
276 5. Spatial resolution: 30m
277 6. Start of time reference: date of the first Landsat composite used by the modeling (e.g. 20220101)

278 7. End of time reference: date of the last Landsat composite used by the modeling (e.g. 20221231)
279 8. Spatial extent: global (go)
280 9. Coordinate system: World Geodetic System 1984, used in GPS (epsg.4326)
281 10. Version: v1
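The naming convention lends itself to simple programmatic handling. The sketch below assumes that the fields are joined by underscores in the listed order, that the procedure codes are joined by dots, and that files carry a `.tif` suffix; none of these separators is stated in the text, so the resulting name is illustrative only.

```python
# Field order follows the list above; the separators are assumptions.
FIELDS = (
    "project", "class_name", "procedure", "variable", "resolution",
    "begin", "end", "extent", "crs", "version",
)


def build_name(**fields):
    """Join the fields in their documented order (hypothetical '_' separator)."""
    return "_".join(fields[name] for name in FIELDS) + ".tif"


def parse_name(filename):
    """Split a file name back into a field dictionary."""
    values = filename.removesuffix(".tif").split("_")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


name = build_name(
    project="gpw", class_name="cultiv.grassland", procedure="rf.savgol.bthr",
    variable="p", resolution="30m", begin="20220101", end="20221231",
    extent="go", crs="epsg.4326", version="v1",
)
print(name)
print(parse_name(name)["class_name"])
```

Keeping dots inside fields and a different character between fields is what makes the name unambiguous to split.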

282 The COG files were uploaded to an S3 service, consuming a total of 4 terabytes of storage. They are publicly accessible through STAC
283 via HTTP range requests, enabling seamless, lazy-loading access from GIS software (e.g. Quantum GIS, MapServer, GeoServer)
284 and programming environments (e.g. JupyterLab, RStudio). They are also accessible via the Google Earth Engine and Geo-wiki.org
285 platforms.
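Once a window of a mosaic has been read with any of these clients, the stored integers still need decoding. The sketch below assumes the window is already available as a NumPy array (the reading step itself is left out) and applies the encoding described in this section: probabilities stored as 0 to 100, dominant classes as 0, 1 and 2, and 255 as no-data.

```python
import numpy as np

NODATA = 255
CLASS_LABELS = {
    0: "other land cover",
    1: "cultivated grassland",
    2: "natural/semi-natural grassland",
}


def decode_probability(raw):
    """Convert stored uint8 probabilities (0..100) to fractions, NaN for no-data."""
    raw = np.asarray(raw, dtype=np.uint8)
    prob = raw.astype(np.float32) / 100.0
    prob[raw == NODATA] = np.nan
    return prob


# Hypothetical 2x3 window of a probability mosaic.
window = np.array([[0, 32, 100], [255, 42, 7]], dtype=np.uint8)
prob = decode_probability(window)
print(prob)
print(CLASS_LABELS[2])
```

Masking no-data before any arithmetic matters because 255 would otherwise be read as a probability above 100%.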

286 Technical Validation


287 Spatial cross-validation and feature importance
288 Our comparison results, shown in Table 4, revealed very similar R2 logloss values for tree-based algorithms (i.e. RF and GBT), while ANN
289 presented the lowest values for both classes of grasslands. We used the precision-recall curves to define probability thresholds that can
290 balance precision and recall (i.e. similar values) and maximize the F1 score [63]. ANN had the highest probability threshold, while GBT
291 had the lowest one. These thresholds were used to convert probabilities into dominant classes (e.g. all samples with predicted probabilities
292 greater than or equal to 0.32 were converted to the “Cultivated grassland” class), which were then used to estimate the F1 score. GBT presented
293 F1 scores slightly higher than RF, and ANN presented the lowest scores for both grass classes. As there were no significant differences in
294 accuracy between RF and GBT, we decided to use RF to train the final global models due to the speed-up possibility offered by TL2cgen [66].

Table 4. Comparison of ML algorithms derived by five-fold spatial blocking CV using 2,122,357 point samples. The
probability thresholds were defined based on a precision-recall curve aiming to maximise the F1 score.

                                   Cultivated grassland                         Natural/Semi-natural grassland
ML algorithm                       R2 logloss   Balanced prob.   F1 score       R2 logloss   Balanced prob.   F1 score
                                                threshold                                    threshold
Random Forest - RF                 0.924        0.328            0.644          0.773        0.428            0.759
Gradient boosting trees - GBT      0.924        0.162            0.653          0.767        0.352            0.760
Artificial Neural Network - ANN    0.916        0.380            0.607          0.697        0.468            0.720
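The balanced thresholds reported in Table 4 can be understood through the following sketch, which picks the point of a precision-recall curve where precision and recall are closest. It uses `sklearn.metrics.precision_recall_curve` on synthetic labels and probabilities, so the numbers it prints are unrelated to the values in the table.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def balanced_threshold(y_true, p_pred):
    """Return the threshold where precision and recall are closest, plus metrics."""
    precision, recall, thresholds = precision_recall_curve(y_true, p_pred)
    # The final (precision=1, recall=0) point has no threshold attached.
    precision, recall = precision[:-1], recall[:-1]
    i = int(np.argmin(np.abs(precision - recall)))
    f1 = 2 * precision[i] * recall[i] / (precision[i] + recall[i])
    return float(thresholds[i]), float(precision[i]), float(recall[i]), float(f1)


# Synthetic out-of-fold predictions for a class with roughly 10% prevalence.
rng = np.random.default_rng(0)
y_true = (rng.random(5000) < 0.1).astype(int)
p_pred = np.clip(0.15 + 0.35 * y_true + rng.normal(0, 0.15, size=5000), 0, 1)
thr, prec, rec, f1 = balanced_threshold(y_true, p_pred)
print(round(thr, 3), round(prec, 3), round(rec, 3), round(f1, 3))
```

Precision equals recall exactly when the number of predicted positives equals the number of reference positives, so at the balanced threshold the F1 score coincides with both.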

295 The accuracy matrix, derived using the probability thresholds shown in Table 4, presented higher accuracies for natural/semi-natural
296 grassland than cultivated grassland (see Table 5). The class other land cover had values greater than 0.90 in all accuracy metrics. Despite
297 the massive number of point samples and the robustness of the spatial blocking CV [64, 70] and sampling design (i.e. FSCS), the current
298 accuracy was based on 7,005 tiles where we had VHR imagery available for the labeling process. Tiles without reference labels might have
299 very specific grassland dynamics that have not been captured by our models and accuracy assessment. Furthermore, our reference data are
300 quite sparse in time, with 40% of tiles having a single year available for visual interpretation, and most of the samples obtained in 2009–2014
301 and 2019–2022 for Bing and Google Maps, respectively (see Fig. S2). This temporal sparsity currently prevents sample-based inference of
302 annual areas for our grassland classes, even though, considering all years, the proportion of cultivated grassland and
303 natural/semi-natural grassland together reaches 32% (see Fig. S1).
304 To overcome these issues, work is ongoing to independently validate output layers (led by IIASA) based on a new set of reference
305 samples and a different group of analysts, following the good practices of evaluation for LULC products [71]. Visual interpretation has been
306 conducted on the Geo-Wiki platform considering the current class definitions/criteria and multiple satellite imagery to address the temporal
307 sparsity (e.g. Google Maps, Bing Maps, Landsat and Sentinel) [72]. This validation helps assess and measure concrete improvements in
308 the next versions of grassland maps since we can reinterpret our current training samples based on feedback and local knowledge without

Figure 3. Global grassland maps for 2000 and 2022 including dominant class and probabilities for cultivated and
natural/semi-natural grassland.
309 changing the independent validation samples. Additionally, we will evaluate the quality of our CV assessment, measuring how well our ML
310 models will perform on a new set of reference samples.
311 Feature importance of our RF models shows that SWIR1 is the most important Landsat band for identifying cultivated grassland, with
312 the highest importance for all bi-monthly periods (see Fig. 4a). The green and red bands, together with NDTI (Normalized Difference Tillage
313 Index), are also important Landsat features and probably contribute to the distinction of cultivated grassland and croplands. The long-term
314 MODIS water vapor (December and February) and the MODIS daytime temperature (October and September) are the only coarser resolution
315 layers (i.e. 1 km) among the top-15 most important features. For natural/semi-natural grassland, eight of the 15 features are coarser resolution
316 layers, including several city accessibility maps [53], which are probably contributing to the identification of remote grassland areas (e.g.
317 nature reserves, semi-arid grasslands, tundra ecosystems). Nevertheless, red is the most important Landsat band for distinguishing this class
318 of grasslands; specifically, the bi-monthly periods from May to December (i.e. four periods; see Fig. 4b) seem to be especially helpful for the predictive mapping.

Figure 4 panels (vertical axis: features, listed from top to bottom as in the figure; horizontal axis: importance, 0.00 to 0.04).

(a) Cultivated grassland RF
 1. Landsat ARD-2 SWIR1 (Jul. & Aug.)
 2. Landsat ARD-2 SWIR1 (Sep. & Oct.)
 3. Landsat ARD-2 SWIR1 (May. & Jun.)
 4. Landsat ARD-2 SWIR1 (Jan. & Feb.)
 5. Landsat ARD-2 SWIR1 (Mar. & Apr.)
 6. Landsat ARD-2 SWIR1 (Nov. & Dec.)
 7. Landsat ARD-2 green (May. & Jun.)
 8. Landsat ARD-2 red (Jul. & Aug.)
 9. Landsat ARD-2 green (Jul. & Aug.)
10. MCD19A2 long-term water vapour mean (Dec.)
11. MOD11A2 long-term day-time temperature mean (Oct.)
12. Landsat ARD-2 NDTI (Sep. & Oct.)
13. MCD19A2 long-term water vapour mean (Feb.)
14. Landsat ARD-2 NDTI (May. & Jun.)
15. MOD11A2 long-term day-time temperature mean (Sep.)

(b) Natural / Semi-natural grassland RF
 1. Landsat ARD-2 red (May. & Jun.)
 2. Landsat ARD-2 red (Nov. & Dec.)
 3. Landsat ARD-2 red (Jul. & Aug.)
 4. Cities accessibility maps (20-50k pop.)
 5. Cities accessibility maps (10-20k pop.)
 6. Cities accessibility maps (1-5M pop.)
 7. Landsat ARD-2 FAPAR (May. & Jun.)
 8. MCD19A2 long-term water vapour mean (Sep.)
 9. Cities accessibility maps (50k-50M pop.)
10. Landsat ARD-2 red (Sep. & Oct.)
11. Cities accessibility maps (50-100K pop.)
12. Cities accessibility maps (100-200K pop.)
13. Landsat ARD-2 NDWI (Jul. & Aug.)
14. Cities accessibility maps (500K-1M pop.)
15. MCD19A2 long-term water vapour mean (Aug.)

Figure 4. Top-15 most important features according to our global RF models for: (a) cultivated grassland, and (b)
natural/semi-natural grassland.

319 Agreement assessment


320 Our agreement assessment of the dominant grassland-class maps (cultivated and natural/semi-natural combined - Fig. 3) and harmonized
321 existing reference samples (Table 3) revealed higher precision (i.e. user’s accuracy) than recall (producer’s accuracy) in all datasets (see Fig.
322 5). In general, this indicates that our grassland predictions are conservative and might not include regions defined as grassland/shrubs by
323 multiple LULC mapping initiatives. Globally, our dominant class maps have precision values higher than 0.7 and F1 scores of 0.79, 0.65 and
324 0.63 according to GLanCE, CGLS-LC and WorldCereal, respectively.
325 Specifically for GLanCE, the accuracy metrics were derived per continent, enabling cross-checking with continental and national datasets.
326 F1 score values greater than 0.8 were found for South America (GLanCE) and Brazil (MapBiomas), a key agricultural frontier with the
327 historical expansion of cultivated grassland [73]. Higher accuracy values were found for the U.S. (LCMAP CONUS) compared to North
328 America, indicating more accurate predictions for the country than for the rest of the continent. Oceania had accuracy values similar
329 to those of North America, which may be explained by similar patterns in their land cover footprint [74]. Asia presented the most balanced
330 precision and recall among all continents, remarkably similar to our CV values (Table 5). In Europe, the F1 score was 0.64, 0.63 and 0.50 according
331 to GLanCE, EuroCrops and LUCAS, respectively, indicating less accurate predictions compared to other continents, with systematic omission
332 error (recall between 0.35 and 0.53). The low accuracy values obtained with LUCAS might indicate significant mismatches between grassland

Table 5. Accuracy matrix for the final RF models estimated by five-fold spatial blocking CV using 2,122,357 point samples.
The precision and recall were balanced considering the probability thresholds 0.32 and 0.42 for cultivated grassland and
natural/semi-natural grassland, respectively.

                                    Expected
Predicted                           Cultivated grassland   Other LC   Total   Recall (Producer’s acc.)
Cultivated grassland                0.062                  0.034      0.096   0.643
Other LC                            0.034                  0.869      0.904   0.962
Total                               0.096                  0.904      1.000
Precision (User’s acc.)             0.644                  0.962

                                    Expected
Predicted                           Natural/Semi-natural grassland   Other LC   Total   Recall (Producer’s acc.)
Natural/Semi-natural grassland      0.202                            0.064      0.266   0.758
Other LC                            0.064                            0.670      0.734   0.913
Total                               0.266                            0.734      1.000
Precision (User’s acc.)             0.759                            0.913

333 classification taxonomies [35]. The lowest accuracy values were obtained in Africa, which is probably related to the widespread disagreement
334 among existing LULC datasets in the continent [75].
335 Considering the wide temporal coverage of GLanCE, we used it to conduct an annual agreement assessment of our dominant class maps.
336 Since its temporal distribution is not regular across the time series (with several samples having class labels for only one to three years), this
337 analysis considered only samples with 10 or more years labeled between 2000 and 2018. We noticed a minor increase in precision (i.e. 0.9394 and
338 0.931 on average for smoothed and non-smoothed probabilities, respectively) accompanied by a minor decrease in recall (i.e. 0.7410 and 0.7449 on
339 average for smoothed and non-smoothed probabilities, respectively) due to SG (Fig. 6). Combined with a visual assessment of probabilities,
340 this confirms that SG increases the spatiotemporal consistency of our predictions without significantly changing their accuracy. The accuracy
341 metrics remain stable throughout the years and show higher precision (i.e. user’s accuracy) than recall (producer’s accuracy) across all years,
342 revealing a systematic omission error (i.e. false negatives), rather than a commission error (i.e. false positives). This can be partially attributed
343 to the establishment of balanced probability thresholds independently for each class, which does not ensure comparable precision and recall
344 values for the combined classes. Compared to the naive threshold (i.e. 0.5), on the other hand, the balanced thresholds increased the F1 score
345 by 0.1241 and recall by 0.1892, on average, while decreasing the precision by 0.0369, on average (see Fig. S3). Additional strategies and
346 applications for grassland probability maps are discussed in further sections.

347 Comparison with other LULC maps


348 To complement our agreement assessment, we performed a spatial comparison between the grassland maps and two 30 m global land cover
349 products: UMD GLAD GLCLUC [13] and GLC_FCS30 [15]. For each grassland class (i.e. cultivated and natural/semi-natural),
350 we calculated the overlap with LULC classes from the products for three years (2000, 2010 and 2020). To allow for easier comparison,
351 we combined some of the classes (deciduous and broadleaf forest into a Forest class, for example) in each of the LULC products and
352 additionally combined any classes with less than 3% overlap with the grassland classes into an Other class. With this comparison, we aimed to
353 identify potential confusion between our grassland predictions and unexpected LULC classes. For example, we expect our grassland classes
354 to overlap with the grassland class from GLC_FCS30 rather than the forest class. The comparisons revealed that the grassland proportions do
355 not change over time, so we show only three years out of 20.
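The overlap computation described above can be sketched with NumPy and pandas as follows. The class codes, class names and toy rasters are hypothetical; only the grouping of related classes and the 3% rule for the Other class come from the text.

```python
import numpy as np
import pandas as pd


def overlap_shares(grass_mask, lulc, class_names, min_share=0.03):
    """Share of grassland pixels per LULC class; classes below `min_share` go to 'Other'."""
    codes, counts = np.unique(lulc[grass_mask], return_counts=True)
    names = [class_names.get(int(code), "Other") for code in codes]
    # Codes mapped to the same name (e.g. two forest types) are summed together.
    shares = pd.Series(counts / counts.sum(), index=names).groupby(level=0).sum()
    small = shares < min_share
    merged = shares[~small].copy()
    if small.any():
        merged.loc["Other"] = merged.get("Other", 0.0) + shares[small].sum()
    return merged.sort_values(ascending=False)


# Hypothetical legend: codes 3 and 4 are two forest types merged into "Forest".
class_names = {1: "Short vegetation", 2: "Cropland", 3: "Forest", 4: "Forest", 5: "Water"}
lulc = np.concatenate(
    [np.repeat([1, 2, 3, 4, 5], [700, 160, 60, 60, 20]), np.full(200, 5)]
).reshape(30, 40)
# The first 1000 pixels are predicted grassland, the last 200 are not.
grass_mask = np.concatenate([np.ones(1000, bool), np.zeros(200, bool)]).reshape(30, 40)
shares = overlap_shares(grass_mask, lulc, class_names)
print(shares)
```

In this toy case water covers 2% of the grassland pixels and is therefore folded into Other, while the two forest codes together reach 12% and are kept.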
356 Comparison between UMD GLAD GLCLUC and our grassland classes revealed that most of the overlap occurs with the short vegetation

Figure 5 panel (title: Grassland agreement assessment; vertical axis: harmonized reference samples; horizontal axis: F1 score, precision and recall, 0.0 to 1.0). Datasets from top to bottom: GLANCE (South America), GLANCE (Global), LCMAP CONUS (U.S.), MapBiomas (Brazil), GLANCE (Asia), GLANCE (Oceania), GLANCE (North America), CGLS-LC (Global), GLANCE (Europe), EuroCrops (Europe), WorldCereal (Global), LUCAS (Europe), GLANCE (Africa).

Figure 5. Agreement assessment of grassland class (i.e. cultivated and natural/semi-natural grassland combined) based on
harmonized existing reference datasets and sorted ascending by F1 score.

357 class (71% for cultivated and 78% for natural/semi-natural), with croplands (16% for cultivated) and with wet short vegetation (16% for
358 natural/semi-natural). Confusion between cultivated grassland and croplands is expected, as these classes may have very similar spectral-
359 temporal responses in EO imagery [76, 33] and overlapping taxonomies (e.g. hay is a type of grass that is planted but falls outside our
360 definition of cultivated grasslands). The comparison between GLC_FCS30 and our grassland classes revealed that most of the overlap occurs
361 with grasslands (24% for cultivated and 27% for natural/semi-natural), rainfed cropland (21% for cultivated), herbaceous cover cropland
362 (27% for cultivated), shrubland (11% for cultivated and 22% for natural/semi-natural), and sparse vegetation (21% for natural/semi-natural).
363 There was unexpected overlap between grassland and forest (14% for cultivated and 12% for natural/semi-natural).
364 However, the comparison between our predictions and the 30 m land cover time-series products is limited because our grassland classes are
365 defined based on the use and overlap of three or more classes (e.g. grassland, shrubland, short vegetation) in either of the two LULC legends. The only
366 global grassland products we can compare with our predictions have coarse resolution, such as the 10 km pasture map of the world for the year
367 2000 [9] and the HILDA+ distribution of pasture/rangeland and unmanaged grass/shrubland at 1 km resolution [10] (see Fig. 7). In general,
368 our predictions of cultivated grassland show a good match with these products, especially with the global pastureland map by Ramankutty et al. [9];
369 on closer inspection, the previous products seem to miss some smaller patches that we are confident can be classified as
370 pastures, but that were probably difficult to distinguish from similar cropland or were simply too small for a resolution of 1 km.
371 A comparison between HILDA+ and our grassland predictions reveals similar patterns of overlap as described above; however, in
372 this case, we also wanted to assess if there are grassland areas that we are missing (as demonstrated by the accuracy assessment based
373 on the GLANCE training dataset) and found that 11% and 12% of our other land cover class fall within areas classified in HILDA+ as
374 pasture/rangeland and unmanaged grass/shrubland, respectively. Moreover, 6% of our other land cover class falls within the pasture class

Figure 6 panel (title: Grassland agreement assessment based on GLANCE training dataset; horizontal axis: years 2000 to 2018; vertical axis: 0.70 to 1.00; series: F1 score, precision and recall, each for SG and No-SG).

Figure 6. Agreement assessment of grassland class (i.e. cultivated and natural/semi-natural grassland combined) based on
GLANCE training dataset. The GLANCE classes grassland (12), shrub (10) and moss/lichen (13) were reclassified to grassland
for matching with our legend. All metrics were derived for smoothed probabilities (i.e. Savitzky-Golay - SG) and non-smoothed
(i.e. No-SG) considering balanced thresholds of 0.32 and 0.42 for cultivated and natural/semi-natural grassland, respectively.

375 of the Ramankutty et al. [9] map for the year 2000. While some of this overlap can be explained by the difference in spatial resolution between
376 the two products (30 m vs 10 km), some of it is due to the under-prediction of the extent of grasslands in our product. On the other hand,
377 because our analysis is not limited to pasturelands, the extent of our natural grasslands far exceeds the extent of pasturelands as reported by
378 Ramankutty et al. [9].

379 Usage Notes


380 Grassland probability maps
381 The main data output described in this paper is the time series of probabilities for two classes of grasslands (i.e. cultivated and natural/semi-
382 natural representing the end members of a spectrum of grassland definitions, selected primarily based on the capacity of identifying them in
383 VHR imagery), estimated independently by global RF models. In general, our predictions are able to capture the expansion of cultivated
384 grassland over different types of native vegetation in the tropics (see Fig. 8a, 8c and 9), and distinguish between grassland and cropland in, for
385 example, Europe (Fig. 8b), Asia (Fig. 8d) and Australia (Fig. 8e) over multiple years.
386 Global modeling enables custom thresholds for converting probability values into dominant classes seamlessly and consistently, since all
387 pixels are predicted using the same model for all years across the world. To demonstrate this application, we derived global maps for dominant
388 classes considering balanced probability thresholds, where precision and recall have similar values according to our five-fold spatial blocking
389 cross-validation (CV) (i.e. 0.32 for Cultivated grassland and 0.42 for Natural/Semi-natural grassland), resulting in more area mapped as
390 grassland (both classes combined) compared to a naive threshold (i.e. 0.5 — see Fig. S3). However, the assessment with existing independent
391 reference sample datasets consistently showed greater precision than recall (i.e. more omission than commission error for dominant classes),
392 which can be partly explained by the inherent limitations in harmonizing multiple grassland definitions with our classification taxonomy. The

Figure 7. Comparison of the pastureland distribution map produced by Ramankutty et al. (2008) [9], land cover classes at 1 km
resolution based on the HILDA+ data set [10], and our predictions for cultivated and natural / semi-natural grassland at 30 m
resolution, focused on: A) Kazakhstan, B) Australia, C) Uruguay, D) Ireland / UK and E) South West Africa.

393 independent accuracy assessment, paired with the visual comparison with existing land cover products, has shown that, most likely, the maps
394 for dominant classes are providing a conservative estimate for global grassland areas. Users of dominant class maps should additionally note
395 that our global thresholds were derived from ~70% of total tiles (i.e. 1×1 km) determined by our sampling design and may not cover specific
396 grassland regions where VHR imagery was not available. Additionally, our predictions were based on independent ML models, which treated
397 each class separately and resulted in several grassland areas mapped simultaneously as cultivated and natural/semi-natural after applying the
398 balanced probability threshold (see Fig. 8). As natural/semi-natural grasslands reached a higher accuracy than cultivated grassland, pixels
399 that reached the required threshold in both classes were assigned the natural/semi-natural class over the cultivated one, which additionally
400 assumes a position in line with the precautionary principle for monitoring global natural/semi-natural grasslands [77].
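The decision rule for the dominant-class maps can be written compactly. The sketch below assumes two probability layers stored as 0 to 100 with 255 as no-data, uses the balanced thresholds of Table 5 (0.32 and 0.42, i.e. 32 and 42 on the stored scale) with a greater-than-or-equal comparison, and gives priority to the natural/semi-natural class where both thresholds are met; the input arrays are invented for the example.

```python
import numpy as np

OTHER, CULTIVATED, NATURAL, NODATA = 0, 1, 2, 255


def dominant_class(p_cult, p_nat, thr_cult=32, thr_nat=42):
    """Combine two probability layers (0..100, 255 = no-data) into one class map."""
    p_cult = np.asarray(p_cult)
    p_nat = np.asarray(p_nat)
    out = np.full(p_cult.shape, OTHER, dtype=np.uint8)
    out[p_cult >= thr_cult] = CULTIVATED
    # Natural/semi-natural overrides cultivated where both thresholds are met.
    out[p_nat >= thr_nat] = NATURAL
    # No-data is applied last because 255 would otherwise pass both thresholds.
    out[(p_cult == NODATA) | (p_nat == NODATA)] = NODATA
    return out


p_cult = np.array([10, 40, 40, 255, 50], dtype=np.uint8)
p_nat = np.array([10, 10, 50, 20, 255], dtype=np.uint8)
print(dominant_class(p_cult, p_nat))  # [  0   1   2 255 255]
```

Users with a different priority (for instance cultivated over natural/semi-natural) would only need to swap the order of the two assignments.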
401 Our mapping strategy has the main aim of providing probabilities that allow the production of customized maps of dominant grassland
402 classes (as demonstrated in the current study) and empower users to define their own decision and integration rules (e.g. probability threshold,
403 class priority, other land cover masks). For example, a user interested in South African grasslands can select a specific probability threshold
404 based on national reference samples, prioritize cultivated over natural/semi-natural grasslands and mask areas mapped as cropland by existing
405 land cover maps. In this way, the global maps provided here constitute an integral component of a broader framework led by GPW focusing
406 on grassland, pastures, and livestock monitoring. Some of the potential uses identified during project conception, which aim to serve a wide
407 range of organizations and user communities at global, national, and local scales, include the following:

408 • Precision-recall calibration: Reference grassland samples, including in-situ data, can be used to estimate precision-recall curves for
409 target areas (e.g. watersheds, biomes, administrative areas), enabling the development and use of locally calibrated thresholds. Such
410 local probability thresholds would necessarily differ from those found in our global analysis (i.e. 0.32 for Cultivated grassland and
411 0.42 for Natural/Semi-natural grassland), and are likely to result in grassland maps which more accurately reflect the target local area.
412 In addition to balancing precision and recall, other criteria could be used to define the threshold, for example minimizing the error of
413 omission based on the method of Murashkin et al. [78].

414 • Area estimation calibration: Known or estimated quantities of cultivated grassland and natural/semi-natural grassland in an
415 administrative area, for example, through reports or census results, can be used to derive thresholds that explicitly enforce correct and
416 spatial class proportions. Recent findings suggest that this can be done in a way that actually modestly improves overall map accuracy,
417 especially in parts of the map where classes are mixed or atypical in the feature space [79], which might be particularly useful to
418 match grazing areas with livestock census records in the context of the Gridded Livestock of the World product [80].

419 • Land cover primitives: Combined with other land cover products, probability maps can be used as “primitives”, which are considered
420 building blocks for the construction of ensemble land cover products. “Primitives” represent raw information needed to make
421 decisions within a dichotomous key applied to land cover typologies, and recent findings have shown consistent and promising
422 results through an implementation that assumes RF probabilities as land cover primitives [81]. In addition to probabilities, dominant
423 land cover classes from existing products (e.g. GLanCE30 [82], GLC FCS30 [83], MapBiomas [17]) can be used as “primitives” if
424 converted to indicators (i.e. binary rasters), weighted by expert-based rules and averaged by standardization fractions that sum up to
425 100% amongst all inputs. Although this possibility can take advantage of several land cover products in a holistic and multi-scale way,
426 the process of legend harmonization amongst the classes might constitute an undefined source of uncertainty and requires further
427 investigation.
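As one possible reading of the area estimation calibration described in the list above, the sketch below derives the threshold at which the mapped share of valid pixels matches a known target share. It is a simplification under stated assumptions: all pixels are treated as having equal area (which does not hold for EPSG:4326 across large latitude ranges), the target fraction is invented, and the probability window is random.

```python
import numpy as np

NODATA = 255


def area_matched_threshold(prob, target_fraction):
    """Threshold such that about `target_fraction` of valid pixels reach it."""
    valid = prob[prob != NODATA].astype(np.float64)
    return float(np.quantile(valid, 1.0 - target_fraction))


# Hypothetical window of stored probabilities (0..100) with some no-data rows.
rng = np.random.default_rng(1)
prob = rng.integers(0, 101, size=(100, 100)).astype(np.uint8)
prob[:5, :] = NODATA

# Assume a census reports that 20% of the area is cultivated grassland.
thr = area_matched_threshold(prob, target_fraction=0.20)
valid = prob != NODATA
mapped = float(np.mean(prob[valid] >= thr))
print(thr, round(mapped, 3))
```

Because the stored probabilities are integers, ties at the threshold value mean the mapped share only approximates the target; an area-weighted quantile would be needed for regions spanning many degrees of latitude.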

428 Current limitations and mapping feedback


429 Despite the flexibility provided by the probability maps, we note several limitations/issues in our grassland predictions. In North America,
430 riparian areas and other land uses, such as plantations and mines (which can present herbaceous vegetation in the early stages of development),
431 depict high probability values in cultivated grassland predictions, perhaps due to the shorter temporal nature of the vegetation and proximity
432 to disturbed / bare soils. We also found some areas of significant confusion between cultivated grasslands and croplands. For example,
433 in the state of Montana, USA, cropland areas located on historical prairie areas have high values in the natural/semi-natural class (high

Figure 8. Examples of predicted probabilities for cultivated and natural/semi-natural grassland in A) Paraguay (-22.2377,
-60.4928); B) Scotland - UK (55.9314, -2.5397); C) Democratic Republic of the Congo — DRC (-7.6433, 23.6100); D)
Kazakhstan (50.8612, 57.8807); and E) Australia (-25.6407, 146.6135). Landsat ARD-2 images are shown as false colour
composite (NIR, SWIR-1 and red) for the year of grassland predictions, highlighting healthy vegetation in bright green, sturdy
vegetation beams in bright red and soils with a mauve. The composites are from Mar. & Apr. (all years) in A) Paraguay and B)
Scotland; Mar. & Apr. 2002 and Nov. & Dec. 2012 in C) DRC; Aug. & Sep. 2015 and May. & Jun. 2020 in D) Kazakhstan;
and May & Jun. 2006 and Mar. & Apr 2017 in E) Australia.

20/28
Figure 9. Our predictions of probabilities for cultivated grassland for 2000, 2010 and 2020 at 30 m spatial resolution (below)
for an area in Brazil (close to Serra Morena) as compared to the Google Time lapse images (above); based on the AirbusMaxar
Technologies high resolution images.

434 probabilities) but not in cultivated grassland predictions. The same issue appears in the arid and hyperarid landscapes of northern Africa and
435 the Arabian Peninsula, where herbaceous croplands (irrigated pivot agriculture), mixed crop-livestock systems, and tree crops are misclassified
436 as cultivated grasslands. These regions also presented some cultivated grassland artefacts in the predictions, especially alongside roads
437 and urban areas. Sudan, Niger, Uganda, Kenya, and Mali have several cropland areas with high probability values for natural/semi-natural
438 grassland, which corroborates the lower accuracy values in Africa found by our assessment of agreement (Fig. 5). It is considered that
439 additional regional expertise in the training samples are required in these regions in particular. In the state of Western Australia, New Zealand,
440 the center of Bolivia, and the state of Mato Grosso (Brazil), it appears that large cropland areas are misclassified as cultivated grasslands
441 suggesting that that attention to the temporal component of grassland /cropland rotation could yield improved accuracy. Specifically, in
442 eastern Madagascar, extensive areas of shifting agriculture have high values for cultivated grassland probabilities — this is probably partially
443 correct and influenced by the regional cultural knowledge of the visual interpreters.
Our predictions for cultivated grasslands presented high probability values for what are more likely extensive areas of natural/semi-natural grassland in some regions, including western Ireland (through length of grassland persistence rather than potential natural vegetation); Kazakhstan; southern Russia; eastern Mongolia; parts of Uruguay; the Drakensberg mountains (South Africa); the south-western Caucasus and adjacent north-eastern Turkey; Sumba Island (Indonesia); the Barkly Tablelands of Australia’s Northern Territory; the western African Sahel belt; northern Papua New Guinea; and north-eastern Nicaragua. However, prioritizing the natural/semi-natural class over cultivated grassland solved this issue in several of the regions mentioned, indicating that the visual interpretation criteria relying on human management indicators, particularly visible animal infrastructure, may require additional cultural expertise to adjust to the various grassland contexts across the world. In western Ireland specifically, the natural/semi-natural grassland predictions presented low probability values, and most of the grassland was classified as cultivated. Although the likely pre-human biome of Ireland was predominantly forest with clearings induced by large animals, the training data explicitly contained no such historical information. This indicates either an implicit expectation in the training data related to biophysical conditions or, more likely, an effect of the high density of cities in the region on the accessibility maps of cities, a quite important feature (spatial layer) of the RF model specialized in natural/semi-natural grassland (see Fig. 4).

[Figure 10 diagram. Both panels start from the annual probability maps (2000–2022) of cultivated and natural/semi-natural grassland. Panel (a): inventory/census data (FAO & national statistics); match grazing area and number of livestock animals; livestock density maps (per admin. unit); area-adjustment downscaling (preserving total sums); outputs are active grazing areas at 30 m and livestock density at 1 km (annual maps 2000–2022). Panel (b): high resolution LCLU products harmonized to a common legend, including ESA WorldCover 2020+ (12), GLAD GLCLUC 2000–2020 (12), GLC_FCS30D 1985–2022 (35) and GLanCE30 2001–2019 (10); derive land cover primitives (indicators, fractions and quality flag); data fusion (standardize input to sum up to 100%); output is an ensemble land cover product (15 harmonized classes, 2000–2022+).]

Figure 10. Future GPW applications for the produced grassland probability maps: (a) to delineate, for example, active grazing areas matching census estimates and help produce more reasonable livestock density maps [80]; (b) to help produce global time-series of ensemble land cover products by harmonizing and combining multiple existing products (ESA WorldCover [14], UMD GLAD GLCLUC [13], GLC FCS30 [83] and GLanCE30 [82]).

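Panel (a) of Figure 10 mentions an area-adjustment downscaling that preserves total sums. One simple reading of that step, offered here as a hedged sketch rather than as the GPW or Gridded Livestock of the World method, is to distribute the census count of an administrative unit across its pixels in proportion to the grassland probability, so that the pixel values add up to the census total. The function name, the probability values and the census count below are hypothetical.

```python
# Minimal sketch: proportional downscaling of a census total.
# Each pixel receives a share of the administrative-unit total in
# proportion to its grassland probability, so the sum is preserved.
# All numbers are hypothetical.


def downscale_total(census_total, probabilities):
    """Return per-pixel values that sum to census_total."""
    if not probabilities:
        return []
    weight_sum = sum(probabilities)
    if weight_sum == 0:
        # No grassland signal in the unit: spread the total evenly.
        n = len(probabilities)
        return [census_total / n for _ in probabilities]
    return [census_total * p / weight_sum for p in probabilities]


# Grassland probabilities (0-1) for the pixels of one admin. unit.
probs = [0.9, 0.6, 0.0, 0.3, 0.2]
heads = downscale_total(1000.0, probs)
print(heads)
print(sum(heads))
```

Pixels with zero probability receive nothing, and the even-spread fallback is only one possible choice for units without any grassland signal.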
In general, we believe that the grassland extent is under-predicted in southeastern Africa (mainly in Zimbabwe and Mozambique) as well as eastern Australia (mainly in the shrublands and woodlands of the Mulga ecoregion), as both grassland classes show low probability values there. On the other hand, over-prediction is apparent in extensive areas with high values of cultivated grassland in intensively grazed areas with partially lost woody vegetation in the western African Sahel belt, the northern-central African savanna-desert transition zone (eastern Chad/western Sudan), farmland mosaics in north-eastern Uganda’s savanna/grassland region, eastern Madagascar’s deforested east coast, and non-cultivated (low-input) pastures in deforested regions of the Selva Maya (Chiapas, Petén) and Amazonian deforestation frontiers. Such over- and under-predictions are not trivial to resolve given that RF is a complex and often opaque prediction system: we cannot be sure whether these outcomes result from extrapolation problems, noise or limited detectability in the Landsat images, the fuzzy definition of the grassland classes, or simply a lack of training points in these areas. Our best approach moving forward is to increase the representation of regional cultural knowledge in these areas and to assess the accuracy of outcomes against a body of socio-cultural knowledge. Ultimately it is the utility of the products, rather than the particular method or mapping accuracy, that ought to be the criterion by which they are judged.

Nevertheless, we can reasonably assume that some of these issues are related to very similar values of two or more classes in the feature space (limited detectability in Landsat images), where our ML models could not separate areas with distinct LULC dynamics as embodied in our visually interpreted training dataset. It appears that intensively managed grasslands, which are highly homogeneous under many conditions, have a high chance of being confused with other classes that have very similar spectral properties, such as urban mosaics (i.e. buildings, sparse trees and grass fields with different densities) or green croplands with similar vegetation height and spatial configuration (such as cereal crops [76, 33]). Less intensively cultivated grasslands, where more diverse plant species can be found and where the landscape may not be very regular, are easily confused with uncultivated grasslands or (semi-)natural herbaceous vegetation in general [75]. In addition, the spectral signal of cultivated grasslands cannot be distinguished from natural/semi-natural grassland as clearly as it can from croplands, where there are clear breaks in vegetation growth in cases where multi-temporal clear-sky images are available [84].
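Pixels where the two grassland classes are hard to separate, or where both receive low values, can be screened directly from the two probability layers. The following sketch is an illustration only: it assumes probabilities expressed on a 0–100 scale, and the thresholds (`low` and `margin`) and category names are hypothetical values chosen for the example, not values used in this study.

```python
# Minimal sketch: screen a pixel using its two grassland probabilities
# (assumed 0-100 scale). Thresholds and category names are hypothetical.


def screen_pixel(p_cultivated, p_natural, low=30, margin=10):
    """Label a pixel by how clearly the two probabilities separate."""
    if p_cultivated < low and p_natural < low:
        # Neither class is likely: non-grassland or possible under-prediction.
        return "both_low"
    if abs(p_cultivated - p_natural) < margin:
        # The two classes are too close to separate reliably.
        return "ambiguous"
    return "cultivated" if p_cultivated > p_natural else "natural_semi_natural"


pixels = [(80, 20), (45, 50), (10, 15), (25, 70)]
labels = [screen_pixel(c, n) for c, n in pixels]
print(labels)
```

Pixels flagged as `ambiguous` or `both_low` in regions with known issues would be natural candidates for targeted visual interpretation or user feedback.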

The distinction between cultivated and natural/semi-natural grasslands has been notoriously difficult to map in the past [85, 16, 17], which has also affected our reference data collection and harmonization process. Hence, our reference labelling protocol relied on more indirect indicators of management, such as fences and other typical infrastructure, hay bales, machine presence, and even animal presence in the field or geometric shapes in the landscape. This may lead to an underestimation of cultivation where its signs are less intensive or where VHR imagery was not available at the time of management practices. Regarding our harmonization process, the description or labelling used by the different datasets is a limiting factor. Since we analyzed samples from a wide range of sources, all with their own ontological definitions and classification taxonomy, harmonization was possible only based on rough estimations. Even after accounting for language and conceptual differences, some fundamental differences between scientific domains, schools of thought or cultural views may also result in ambiguous terms or descriptions. For example, what is called “rangeland” in the U.S. would be called “pasture” in Europe, while a “pastagem” (the literal translation of “pasture”) would be regarded as a cultivated grassland in Brazil. Often, the finer distinctions of how dataset creators perceive and interpret mental concepts whilst creating the training dataset are missing from the dataset description, making it harder for downstream applications to form a proper semantic match across many datasets. Due to these challenges, we have attempted to be as clear and transparent as possible in our visual interpretation criteria and to plan for active inclusion of regional cultural knowledge.

One possible way to resolve such semantic/ontological issues is through international registers where land cover and land use classes/systems are unequivocally specified and illustrated with decision trees and photographs accompanied by multi-lingual descriptions. However, for this, the international community would not only have to provide such context, but also have to agree on some thresholds and recommendations, such as the minimum number of livestock units per ha in relation to productivity, the minimum number of years under some land use system, the duration of fallow periods, and a list of recommended indicator species for cultivated grasslands, on the presumption that multi-spectral imaging becomes widely available. Such forward-looking considerations aside, our predicted grassland distribution for 2000–2022 aims to become an integral component of a broader framework of monitoring products to be produced by GPW, which will also include aspects of grassland management, condition and emissions. The data set presented here is the first essential step toward these future products, serving as both a pioneering demonstration and a foundation for ongoing refinements.

Users need to be aware of the limitations of our produced datasets and the known issues discussed in this section, and should consider them carefully to ensure appropriate use of the maps at this initial prediction stage. Alongside noting shortcomings in the prediction products, we are actively working to address most of these issues through mapping feedback campaigns on the Geo-Wiki platform, where experts and/or users with local knowledge of LULC classes can visualize and interact with the most recent versions of our products. Additionally, all global products used in our comparison analyses (UMD GLAD, GLC FCS30D, HILDA+, Ramankutty et al., 2008 [9]) have been uploaded to the platform, supporting users in providing feedback regarding overall agreement, spatio-temporal consistency, and over- and under-predicted grassland areas of any classification type. Feedback solicited via Geo-Wiki may consist of drawing polygons in designated or non-designated areas, concentrating on the differentiation of (i) grassland or non-grass cover and (ii) cultivated or natural/semi-natural grassland. To improve the consistency of the mapping feedback and avoid ambiguities in visual interpretation and classification, users are provided with sufficient materials to follow the predefined labeling protocols. The consortium considers that systematically collected feedback, together with multiple partnerships and wide stakeholder participation, will provide the most efficient path for improving future versions of the GPW products, supporting the development of fit-for-purpose applications able to advance the protection, restoration and sustainable use of global grasslands. We encourage and welcome all readers of this publication to contribute knowledge to this effort.

Code availability


The global maps of grasslands are publicly available as Cloud-Optimized GeoTIFF (COG) files through a SpatioTemporal Asset Catalog (STAC) and Google Earth Engine (GEE). Users can provide feedback and report classification errors for dominant class maps in Geo-Wiki (geo-wiki.org). We provide a public web interface for rapid exploration of the dataset and comparison with existing LULC products at GEE. All processing steps presented in this paper were implemented in Python, and the source code is publicly available (MIT License) at: https://github.com/wri/global-pasture-watch. In addition, to ensure reproducibility, we have archived the reference samples and trained model (including example code for running inference) in Zenodo at: https://doi.org/10.5281/zenodo.11281157 and https://doi.org/10.5281/zenodo.11280849.

References
1. Bardgett, R. D. et al. Combatting global grassland degradation. Nat. Rev. Earth & Environ. 2, 720–735, 10.1038/s43017-021-00207-2 (2021).
2. O’Mara, F. P. The role of grasslands in food security and climate change. Annals Bot. 110, 1263–1270, 10.1093/aob/mcs209 (2012).
3. Klein Goldewijk, K., Beusen, A., Doelman, J. & Stehfest, E. Anthropogenic land use estimates for the Holocene–HYDE 3.2. Earth System Science Data 9, 927–953, 10.5194/essd-9-927-2017 (2017).
4. Chang, J. et al. Climate warming from managed grasslands cancels the cooling effect of carbon sinks in sparsely grazed and natural grasslands. Nat. Commun. 12, 118, 10.1038/s41467-020-20406-7 (2021).
5. Herrero, M. et al. Biomass use, production, feed efficiencies, and greenhouse gas emissions from global livestock systems. Proc. Natl. Acad. Sci. 110, 20888–20893, 10.1073/pnas.1308149110 (2013).
6. Phelps, L. N. & Kaplan, J. O. Land use for animal production in global change studies: Defining and characterizing a framework. Glob. change biology 23, 4457–4471, 10.1038/nature20584 (2017).
7. Sulla-Menashe, D., Gray, J. M., Abercrombie, S. P. & Friedl, M. A. Hierarchical mapping of annual global land cover 2001 to present: The MODIS Collection 6 Land Cover product. Remote. Sens. Environ. 222, 183–194, 10.1016/j.rse.2018.12.013 (2019).
8. Plummer, S., Lecomte, P. & Doherty, M. The ESA Climate Change Initiative (CCI): A European contribution to the generation of the Global Climate Observing System. Remote. Sens. Environ. 203, 2–8, 10.1016/j.rse.2017.07.014 (2017).
9. Ramankutty, N., Evan, A. T., Monfreda, C. & Foley, J. A. Farming the planet: 1. geographic distribution of global agricultural lands in the year 2000. Glob. biogeochemical cycles 22, 10.1029/2007GB002952 (2008).
10. Winkler, K., Fuchs, R., Rounsevell, M. & Herold, M. Global land use changes are four times greater than previously estimated. Nat. communications 12, 2501, 10.1038/s41467-021-22702-2 (2021).
11. Brown, C. F. et al. Dynamic World, Near real-time global 10 m land use land cover mapping. Sci. Data 9, 251, 10.1038/s41597-022-01307-4 (2022).
12. Friedl, M. A. et al. Medium Spatial Resolution Mapping of Global Land Cover and Land Cover Change Across Multiple Decades From Landsat. Front. Remote. Sens. 3, 894571, 10.3389/frsen.2022.894571 (2022).
13. Potapov, P. et al. The global 2000-2020 land cover and land use change dataset derived from the landsat archive: first results. Front. Remote. Sens. 3, 856903, 10.3389/frsen.2022.856903 (2022).
14. Zanaga, D. et al. ESA WorldCover 10 m 2020 v100, 10.5281/zenodo.5571936 (2021).
15. Zhang, X. et al. GLC_fcs30d: the first global 30 m land-cover dynamics monitoring product with a fine classification system for the period from 1985 to 2022 generated using dense-time-series Landsat imagery and the continuous change-detection method. Earth Syst. Sci. Data 16, 1353–1381, 10.5194/essd-16-1353-2024 (2024).

16. Jones, M. O. et al. Innovation in rangeland monitoring: annual, 30 m, plant functional type percent cover maps for U.S. rangelands, 1984–2017. Ecosphere 9, e02430, 10.1002/ecs2.2430 (2018).
17. Souza, C. M. et al. Reconstructing Three Decades of Land Use and Land Cover Changes in Brazilian Biomes with Landsat Archive and Earth Engine. Remote. Sens. 12, 2735, 10.3390/rs12172735 (2020).
18. Stanimirova, R. et al. A global land cover training dataset from 1984 to 2020. Sci. Data 10, 879 (2023).
19. Potapov, P. et al. Landsat analysis ready data for global land cover and land cover change mapping. Remote. Sens. 12, 426, 10.3390/rs12030426 (2020).
20. Wan, Z., Hook, S. & Hulley, G. MODIS/Terra Land Surface Temperature/Emissivity 8-Day L3 Global 1km SIN Grid V061, 10.5067/MODIS/MOD11A2.061 (2021).
21. Lyapustin, A. & Wang, Y. MODIS/Terra+Aqua Land Aerosol Optical Depth Daily L2G Global 1km SIN Grid V006, 10.5067/MODIS/MCD19A2.006 (2018).
22. Witjes, M. et al. A spatiotemporal ensemble machine learning framework for generating land use/land cover time-series maps for Europe (2000–2019) based on LUCAS, CORINE and GLAD Landsat. PeerJ 10, e13573, 10.7717/peerj.13573 (2022).
23. Ma, T., Brus, D. J., Zhu, A.-X., Zhang, L. & Scholten, T. Comparison of conditioned Latin hypercube and feature space coverage sampling for predicting soil classes using simulation from soil maps. Geoderma 370, 114366, 10.1016/j.geoderma.2020.114366 (2020).
24. ESA Climate Change Initiative. Global Land Cover time-series v2.1.1 (1992 - 2015). http://maps.elie.ucl.ac.be/CCI/viewer/download.php (2021).
25. European Space Agency. Copernicus GLO-90 Digital Elevation Model, 10.5069/G9028PQB (2021).
26. Amatulli, G., McInerney, D., Sethi, T., Strobl, P. & Domisch, S. Geomorpho90m, empirical evaluation and accuracy assessment of global high-resolution geomorphometric layers. Sci. Data 7, 162, 10.1038/s41597-020-0479-6 (2020).
27. Didan, K. MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V061, 10.5067/MODIS/MOD13Q1.061 (2021).
28. Karger, D. N. et al. Climatologies at high resolution for the earth’s land surface areas. Sci. data 4, 1–20, 10.1038/sdata.2017.122 (2017).
29. Pekel, J.-F., Cottam, A., Gorelick, N. & Belward, A. S. High-resolution mapping of global surface water and its long-term changes. Nature 540, 418–422, 10.1038/nature20584 (2016).
30. Allen, V. G. et al. An international terminology for grazing lands and grazing animals. Grass forage science 66, 2, 10.1111/j.1365-2494.2010.00780.x (2011).
31. Upcott, E. V., Henrys, P. A., Redhead, J. W., Jarvis, S. G. & Pywell, R. F. A new approach to characterising and predicting crop rotations using national-scale annual crop maps. Sci. Total. Environ. 860, 160471, 10.1016/j.scitotenv.2022.160471 (2023).
32. de Oliveira, B. S., Teles, N. M., Mesquita, V. V., Parente, L. L. & Ferreira, L. G. Integrated Approach to Global Land Use and Land Cover Reference Data Harmonization, 10.5281/zenodo.11246630 (2024).
33. Van Tricht, K. et al. Worldcereal: a dynamic open-source system for global-scale, seasonal, and reproducible crop and irrigation mapping. Earth Syst. Sci. Data 15, 5491–5515, 10.5194/essd-15-5491-2023 (2023).
34. Schneider, M., Schelte, T., Schmitz, F. & Körner, M. Eurocrops: The largest harmonized open crop dataset across the european union. Sci. Data 10, 612, 10.1038/s41597-023-02517-0 (2023).
35. d’Andrimont, R. et al. Harmonised lucas in-situ land cover and use database for field surveys from 2006 to 2018 in the european union. Sci. data 7, 352, 10.1038/s41597-019-0340-y (2020).
36. Stehman, S. V., Pengra, B. W., Horton, J. A. & Wellington, D. F. Validation of the us geological survey’s land change monitoring, assessment and projection (lcmap) collection 1.0 annual land cover products 1985–2017. Remote. sensing environment 265, 112646, 10.1016/j.rse.2021.112646 (2021).
37. Tsendbazar, N. et al. Product validation report (d12-pvr) v 1.1 (2021).
38. Crawford, C. J. et al. The 50-year landsat collection 2 archive. Sci. Remote. Sens. 8, 100103, 10.1016/j.srs.2023.100103 (2023).
39. Consoli, D. et al. A computational framework for processing time-series of earth observation data based on discrete convolution: global-scale historical landsat cloud-free aggregates at 30 m spatial resolution. PeerJ in review, 10.21203/rs.3.rs-4465582/v1 (2024). Preprint posted at Research Square.
40. Roy, P., Sharma, K. & Jain, A. Stratification of density in dry deciduous forest using satellite remote sensing digital data—an approach based on spectral indices. J. biosciences 21, 723–734 (1996).
41. Huete, A. et al. Overview of the radiometric and biophysical performance of the modis vegetation indices. Remote Sensing of Environment 83, 195–213 (2002).
42. Van Deventer, A., Ward, A., Gowda, P. & Lyon, J. Using thematic mapper data to identify contrasting soil plains and tillage practices. Photogramm. engineering remote sensing 63, 87–93 (1997).
43. Tucker, C. J. Red and photographic infrared linear combinations for monitoring vegetation. Remote sensing of Environment 8, 127–150 (1979).
44. Gao, B.-C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sensing of Environment 58, 257–266 (1996).
45. Badgley, G., Field, C. B. & Berry, J. A. Canopy near-infrared reflectance and terrestrial photosynthesis. Sci. advances 3, e1602244 (2017).
46. Castaldi, F., Chabrillat, S., Don, A. & van Wesemael, B. Soil organic carbon mapping using lucas topsoil database and sentinel-2 data: An approach to reduce soil moisture and crop residue effects. Remote. Sens. 11, 2121 (2019).
47. Robinson, N. P. et al. Terrestrial primary production for the conterminous United States derived from Landsat 30 m and MODIS 250 m. Remote Sensing in Ecology and Conservation 4, 264–280 (2018).
48. Parente, L., Simoes, R. & Hengl, T. Monthly aggregated Water Vapor MODIS MCD19A2 (1 km): Long-term data (2000-2022), 10.5281/zenodo.8192544 (2023).
49. Ho, Y. F., Hengl, T. & Parente, L. Ensemble Digital Terrain Model (EDTM) of the world (1.1) (OpenGeoHub foundation, Doorwerth, NL, 2023).
50. Tadono, T. et al. Generation of the 30 m-mesh global digital surface model by alos prism. The international archives photogrammetry, remote sensing spatial information sciences 41, 157–162 (2016).
51. Strobl, P. The new copernicus digital elevation model. GSICS Q. 14, 17–18 (2020).
52. Yamazaki, D. et al. Merit dem: A new high-accuracy global digital elevation model and its merit to global hydrodynamic modeling. In AGU fall meeting abstracts, vol. 2017 (2017).
53. Nelson, A. et al. A suite of global accessibility indicators. Sci. data 6, 266 (2019).
54. Pickens, A. H. et al. Mapping and sampling to characterize global inland water dynamics from 1999 to 2018 with full landsat time-series. Remote. Sens. Environ. 243, 111792 (2020).
55. Kilibarda, M. et al. Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution. J. Geophys. Res. Atmospheres 119, 2294–2313 (2014).
56. Demarchi, L. et al. Recursive feature elimination and random forest classification of natura 2000 grasslands in lowland river valleys of poland based on airborne hyperspectral and lidar data fusion. Remote. Sens. 12, 1842, 10.3390/rs12111842 (2020).
57. Jamieson, K. & Talwalkar, A. Non-stochastic best arm identification and hyperparameter optimization. In Artificial intelligence and statistics, 240–248, 10.1109/SDS.2019.00-11 (PMLR, 2016).
58. Breiman, L. Random forests. Mach. learning 45, 5–32, 10.1023/A:1010933404324 (2001).
59. Friedman, J. H. Greedy function approximation: a gradient boosting machine. Annals statistics 1189–1232, 10.1214/aos/1013203451 (2001).
60. Zou, J., Han, Y. & So, S.-S. Overview of artificial neural networks. Artif. neural networks: methods applications 14–22, 10.1007/978-1-60327-101-1_2 (2009).
61. Shaharum, N. et al. Image classification for mapping oil palm distribution via support vector machine using scikit-learn module. The Int. Arch. Photogramm. Remote. Sens. Spatial Inf. Sci. 42, 133–137, 10.5194/isprs-archives-XLII-4-W9-133-2018 (2018).
62. Bonannella, C. et al. Forest tree species distribution for europe 2000–2020: mapping potential and realized distributions using spatiotemporal machine learning. PeerJ 10, e13728, 10.7717/peerj.13728 (2022).
63. Ebrahimy, H., Mirbagheri, B., Matkan, A. A. & Azadbakht, M. Effectiveness of the integration of data balancing techniques and tree-based ensemble machine learning algorithms for spatially-explicit land cover accuracy prediction. Remote. Sens. Appl. Soc. Environ. 27, 100785, 10.1016/j.rsase.2022.100785 (2022).
64. Roberts, D. R. et al. Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography 40, 913–929, 10.1111/ecog.02881 (2017).
65. Marconcini, M. et al. Outlining where humans live, the world settlement footprint 2015. Sci. Data 7, 242, 10.1038/s41597-020-00580-5 (2020).
66. TL2cgen: model compiler for decision trees. https://tl2cgen.readthedocs.io/en/latest/. Accessed: 2024-03-11.
67. Shekhar, C. On simplified application of multidimensional savitzky-golay filters and differentiators. In AIP Conference Proceedings, vol. 1705, 10.1063/1.4940262 (AIP Publishing, 2016).
68. Yoo, A. B., Jette, M. A. & Grondona, M. Slurm: Simple linux utility for resource management. In Workshop on job scheduling strategies for parallel processing, 44–60 (Springer, 2003).
69. Boettiger, C. An introduction to docker for reproducible research. ACM SIGOPS Oper. Syst. Rev. 49, 71–79, 10.1145/2723872.2723882 (2015).
70. King, R. D., Orhobor, O. I. & Taylor, C. C. Cross-validation is safe to use. Nat. Mach. Intell. 3, 276–276, 10.1038/s42256-021-00332-z (2021).
71. Stehman, S. V. & Foody, G. M. Key issues in rigorous accuracy assessment of land cover products. Remote. Sens. Environ. 231, 111199, 10.1016/j.rse.2019.05.018 (2019).
72. Fritz, S. et al. Geo-wiki: An online platform for improving global land cover. Environ. Model. & Softw. 31, 110–123, 10.1016/j.envsoft.2011.11.015 (2012).
73. Zalles, V. et al. Rapid expansion of human impact on natural land in south america since 1985. Sci. Adv. 7, eabg1620, 10.1126/sciadv.abg1620 (2021).
74. Creutzig, F. et al. Assessing human and environmental pressures of global land-use change 2000–2010. Glob. Sustain. 2, e1 (2019).
75. Pérez-Hoyos, A., Udías, A. & Rembold, F. Integrating multiple land cover maps through a multi-criteria analysis to improve agricultural monitoring in africa. Int. J. Appl. Earth Obs. Geoinformation 88, 102064, 10.1016/j.jag.2020.102064 (2020).
76. Blickensdörfer, L. et al. Mapping of crop types and crop sequences with combined time series of sentinel-1, sentinel-2 and landsat 8 data for germany. Remote. sensing environment 269, 112831, 10.1016/j.rse.2021.112831 (2022).
77. Kriebel, D. et al. The precautionary principle in environmental science. Environ. health perspectives 109, 871–876, 10.1289/ehp.0110987 (2001).
78. Murashkin, D., Spreen, G., Huntemann, M. & Dierking, W. Method for detection of leads from sentinel-1 sar images. Annals Glaciol. 59, 124–136, 10.1017/aog.2018.6 (2018).
79. Witjes, M., Herold, M. & de Bruin, S. Iterative Mapping of Probabilities (IMP): A data fusion framework for generating accurate land cover maps that match area statistics. J. Appl. Earth Obs. Geoinformation 10.21203/rs.3.rs-3481177/v1 (2024). Accepted for publication.
80. Gilbert, M. et al. Global distribution data for cattle, buffaloes, horses, sheep, goats, pigs, chickens and ducks in 2010. Sci. data 5, 1–11, 10.1038/sdata.2018.227 (2018).
81. Saah, D. et al. Primitives as building blocks for constructing land cover maps. Int. J. Appl. Earth Obs. Geoinformation 85, 101979, 10.1016/j.jag.2019.101979 (2020).
82. Arevalo, P. et al. Global land cover mapping and estimation yearly 30 m V001 (Distributed by NASA EOSDIS Land Processes DAAC, 2022).
83. Zhang, X. et al. Glc_fcs30: global land-cover product with fine classification system at 30 m using time-series landsat imagery. Earth Syst. Sci. Data 13, 2753–2776, 10.5194/essd-13-2753-2021 (2021).
84. Potapov, P. et al. Global maps of cropland extent and change show accelerated cropland expansion in the twenty-first century. Nat. Food 3, 19–28, 10.1038/s43016-021-00429-z (2022).
85. Mancino, G., Falciano, A., Console, R. & Trivigno, M. L. Comparison between parametric and non-parametric supervised land cover classifications of sentinel-2 msi and landsat-8 oli data. Geographies 3, 82–109, 10.3390/geographies3010005 (2023).

Acknowledgements
This research was supported by a grant to the Land & Carbon Lab from the Bezos Earth Fund. CM acknowledges support through the Senior Scientist program of iDiv, funded by the German Research Foundation (DFG–FZT 118, 202548816).

Author contributions statement

L.P. was the primary author and, together with L.S., T.H., I.W., L.F., S.F. and F.S., conceived, designed and coordinated the implementation of the mapping framework. L.P. and D.C. implemented the EO data pre-processing, model training, predictive modeling and data publication. V.M., N.T., M.H., L.F., A.P.M. and B.O. performed the reference data collection and the harmonization of existing reference samples. L.P., L.S., R.S., M.S., S.E. and C.M. performed visual assessment and technical validation of the results. L.P., T.H. and M.S. prepared data visualization. L.P., L.S., R.S., T.H., C.B., N.T., I.W., M.H., S.F., C.M., M.W., S.E. and Z.M. contributed to the writing. All authors reviewed the manuscript.

Competing interests

The authors declare no competing interests.

Supplementary Files
This is a list of supplementary files associated with this preprint.

parenteetal2024sm.pdf
