1. Introduction
Forecasting the state of forest ecosystems is necessary to manage forest resources and determine strategies for their use [1,2]. Models allow us to construct different scenarios for forest management, simulate planned impacts, obtain forecasts for development, and provide estimates of the long-term environmental and economic consequences of all forest management options [3,4]. The resulting estimates help prevent undesirable changes in ecosystems, such as loss of forest area, loss of biodiversity, and degradation of soil quality.
For areas the size of a district or a region, forest landscape models (FLMs) are used. They cover a spatial scale of 100–10,000 km² over time spans from several decades to hundreds of years. Landscape models fall into two main categories: phenomenological and process models. Phenomenological models rely on empirical data and experimental material to estimate transition probabilities for forest inventory and management planning [5,6]. Their disadvantage is that the result depends on the stability of the initial conditions, so they cannot account for possible changes in the climate or in the modeled territory.
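The transition-probability idea behind phenomenological models can be illustrated with a minimal sketch. The state names and the probability values below are hypothetical and are not taken from any of the cited models; the point is only that a fixed matrix cannot respond to a changing climate.

```python
# Hypothetical stand states and per-step transition probabilities
# (illustrative values only; each row sums to 1).
STATES = ["young", "mature", "old"]
TRANSITIONS = [
    [0.80, 0.20, 0.00],  # young  -> young, mature, old
    [0.05, 0.75, 0.20],  # mature -> young (e.g. after felling), mature, old
    [0.10, 0.00, 0.90],  # old    -> young (e.g. after disturbance), mature, old
]


def step(shares, transitions):
    """Advance the area shares of each state by one time step."""
    n = len(shares)
    return [
        sum(shares[i] * transitions[i][j] for i in range(n))
        for j in range(n)
    ]


def project(shares, transitions, steps):
    """Apply the same fixed transition matrix for a number of steps."""
    for _ in range(steps):
        shares = step(shares, transitions)
    return shares


if __name__ == "__main__":
    # Start with a landscape that is entirely young forest.
    print(project([1.0, 0.0, 0.0], TRANSITIONS, 10))
```

Because `TRANSITIONS` stays constant over the whole projection, the forecast is valid only while the conditions under which the probabilities were estimated persist.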
Process models describe the ecosystem structure and the mechanisms underlying its functioning. They consider the interaction of the main factors (for example, soil, climate, light) and processes (species competition, growth, anthropogenic impact, photosynthetic respiration, community succession), which makes it possible to simulate forest landscape dynamics in time and space [7,8,9]. Process models resolve the main components and processes of ecosystems in detail, making it possible to assess their ecological development over the coming decades and to predict the impact of various types of disturbances, management decisions, and climate change on the forest [10]. Accounting for a variety of factors allows a large number of scenarios to be evaluated, which ultimately yields more reliable forecasts of the forest state.
For most forest landscape models, the initial input data are vegetation maps containing information about certain characteristics of trees growing in the study area (species, age, biomass, etc.). One of the most widely used FLMs is LANDIS-II (Landscape Disturbance and Succession Model) [11,12,13,14,15], which was developed to simulate changes in forest ecosystems based on disturbance and succession processes. The model takes into account various types of disturbances, such as fires, wind, and diseases, as well as succession processes such as tree growth and competition between trees. The initial dataset for LANDIS-II depends on the succession extension used. In the simplest version, this is information about tree species and age classes (Age-Only Succession extension); other extensions add maximum above-ground biomass, net productivity, and climate data (Biomass Succession extension), respiration and photosynthesis parameters (PnET Succession), or soil, nitrogen, and carbon parameters (NECN Succession).
The LandClim model [16,17] simulates forest landscapes over periods from decades to millennia. The base entity in LandClim, a cohort of trees, is always associated with a separate grid cell. The initial data are the species and age of the trees, climate data, and geographical characteristics, e.g., elevation, slope, and aspect. The iLand model considers the effects of climate change, competition between trees and other plants, and the impact of anthropogenic activities. Its basic entity is the individual tree; growth, mortality, and competition are modeled at this level. In the TreeMig model [18], geographical and climate parameters, the number of trees, and their height classes for different species are specified for the grid cells. The model assumes that trees, along with their density and the available light, are randomly distributed within the cell.
Thus, the most commonly used FLMs require information about tree species and age classes. Collecting such information with acceptable accuracy over large areas is a complex task, often requiring a combination of data from different sources [19]. As the study area grows, collecting field data becomes more difficult and sometimes impossible because of the geographical characteristics and low transport accessibility of the territory. At the same time, the existing National Forest Inventory data are not always available to researchers in full.
This study demonstrates FLM initialization using the LANDIS-II model. LANDIS-II simulates forest landscapes larger than 100 ha. It includes a wide library of extensions for simulating various ecosystem processes at the stand and landscape levels (succession, fires, wind, felling, tree diseases), as well as a main module that controls the interaction between extensions [11,12,20,21]. In LANDIS-II, the study area is divided into a grid of interacting sites; in each site, trees of certain species and age classes grow. Each age class of one species in the model represents a cohort. Various processes (succession, anthropogenic impacts) take place between the sites. At the same time, cohorts of different species in the same cell compete for resources (light, soil moisture, space). Within sites, stand-level forest processes occur, while landscape-level processes, such as tree seed dispersal and disturbance, typically affect several neighboring sites. Moreover, the landscape can be divided into several ecoregions, each of which combines cells with similar ecological conditions that affect the forest state. To take this influence into account, ecoregion parameters for individual species (for example, the ability to acclimatize in a given region) are added to extensions.
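The site, cohort, and ecoregion structure described above can be sketched as a small data model. This is only an illustration of the concepts; the class and field names are hypothetical and do not reflect the actual LANDIS-II code or input file formats.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cohort:
    """One age class of one species within a site."""
    species: str
    age_class: int  # upper bound of the age class, in years (assumed convention)


@dataclass
class Site:
    """One grid cell of the landscape, assigned to a single ecoregion."""
    row: int
    col: int
    ecoregion: str
    cohorts: list = field(default_factory=list)


def species_in_site(site):
    """Species that compete for resources within the same cell."""
    return sorted({c.species for c in site.cohorts})


def sites_by_ecoregion(sites):
    """Group sites that share similar ecological conditions."""
    groups = {}
    for site in sites:
        groups.setdefault(site.ecoregion, []).append(site)
    return groups


if __name__ == "__main__":
    landscape = [
        Site(0, 0, "eco1", [Cohort("pine", 40), Cohort("pine", 80)]),
        Site(0, 1, "eco1", [Cohort("pine", 60), Cohort("birch", 20)]),
        Site(1, 0, "eco2", []),
    ]
    print(species_in_site(landscape[1]))
    print({k: len(v) for k, v in sites_by_ecoregion(landscape).items()})
```

Keeping cohorts as (species, age class) pairs rather than individual trees is what lets a cohort-based model cover a whole landscape with modest input data.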
Information about tree species can be obtained from land cover classification products derived from satellite images. Several land cover classifications have been developed whose class sets include different species or groups of tree species (coniferous, broad-leaved, etc.) [22,23,24]. It is also possible to obtain information about tree age from remote sensing data. The age of a stand of a given species is directly related to its biomass; this correspondence can be found in regional reference materials (yield tables). One source of biomass data is maps of above-ground biomass (AGB). AGB is the mass of living vegetation above the ground, including stem, stump, twigs, bark, seeds, and foliage, expressed per unit area.
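The link between biomass and age through a yield table can be sketched as an inverse lookup with linear interpolation. The table values, the 20-year class width, and the function names below are hypothetical and serve only to show the idea; real yield tables are regional and species-specific.

```python
from bisect import bisect_left

# Hypothetical yield table for one species group:
# (stand age in years, above-ground biomass in t/ha), increasing in both columns.
YIELD_TABLE = [(20, 40.0), (40, 90.0), (60, 130.0), (80, 160.0), (100, 180.0)]


def age_from_biomass(agb, table=YIELD_TABLE):
    """Estimate stand age from AGB by inverting the yield table.

    Values outside the table are clamped to its first or last age.
    """
    ages = [a for a, _ in table]
    biomass = [b for _, b in table]
    if agb <= biomass[0]:
        return float(ages[0])
    if agb >= biomass[-1]:
        return float(ages[-1])
    i = bisect_left(biomass, agb)
    b0, b1 = biomass[i - 1], biomass[i]
    a0, a1 = ages[i - 1], ages[i]
    return a0 + (agb - b0) * (a1 - a0) / (b1 - b0)


def age_class(age, width=20):
    """Upper bound of the age class containing the given age."""
    return int(-(-age // width) * width)


if __name__ == "__main__":
    age = age_from_biomass(110.0)
    print(age, age_class(age))
```

Because biomass growth slows in old stands, the inversion becomes less reliable near the upper end of the table, where a small biomass difference corresponds to a large age difference.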
When quantifying biomass, forest properties are often characterized by three types of remote sensing data [25,26]: passive optical spectral reflectance is sensitive to vegetation structure (leaf area index, crown size, and tree density), texture, and shade; radar data measure the dielectric and geometric properties of forests; lidar data characterize the vertical structure and height of vegetation. Each type of data has its own advantages and disadvantages in depicting forest properties, so methods that combine data from several sensors are often used to achieve a higher accuracy of biomass estimation. Empirical regression models, non-parametric methods, and physically based allometric models are used to determine the correspondence between remote sensing data and forest biomass indicators. Moreover, FLM usually requires data on climate and soils with sufficient spatial resolution to initialize the parameters of the regions of the study area. Such data can be obtained from open databases [27,28].
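As an illustration of the simplest of the approaches mentioned above, an empirical regression, the following sketch fits a straight line between a remote sensing predictor and field-measured biomass by ordinary least squares. The predictor, the sample values, and the function names are hypothetical and synthetic; operational biomass models typically use several predictors and more flexible methods.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted relationship to new predictor values."""
    return [slope * xi + intercept for xi in x]


if __name__ == "__main__":
    # Synthetic calibration plots: a canopy height metric (m) and AGB (t/ha).
    height = [5.0, 10.0, 15.0, 20.0]
    agb = [30.0, 62.0, 88.0, 121.0]
    slope, intercept = fit_line(height, agb)
    print(predict([12.0], slope, intercept))
```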
Therefore, one source for FLM initialization is open remote sensing-based databases. Their advantage is that they speed up the preparation of input data for the models and reduce its cost. The development of satellite instruments makes it possible to obtain regular information about the state of various ecosystems, including vegetation, anywhere in the world. The purpose of this paper is to describe the available data sources for FLM initialization, compare their capabilities, and present a step-by-step process for collecting and integrating information to map the initial state of the forest landscape. Achieving this goal will facilitate the use of FLM to assess the future state of the forest, even in areas where inventory data are not available to researchers.
4. Discussion
Our results demonstrate that the data needed to initialize FLM can be collected from open sources. Many studies [14,36,37,38,39] use the available forest inventory data (for example, national or regional cadasters), which contain detailed information about forest types, tree age, diameter, height, and disturbances. In the absence of such data, FLM initialization becomes a difficult task.
The emergence in recent years of global land cover classifications with a resolution of 10–30 m and classes of different tree species brings us closer to solving this problem. Using such classifications as the basis for compiling the initial map for FLM involves a simplification: instead of specific tree species, groups of species are used (coniferous, deciduous, etc.). If the region of interest does not contain many forest-forming species and each group has a dominant species, the proposed approach is justified. Moreover, groups of species can be used to initialize models when it is necessary to trace the general trends in forest dynamics in the territory.
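The simplification described above, replacing each land cover group with its dominant species, amounts to a two-step lookup over the classification raster. In the sketch below the class codes, the dictionary names, and the choice of birch for the deciduous group are assumptions made for illustration; only the pairing of the coniferous group with pine follows the example used in this study.

```python
# Hypothetical class codes of a land cover product.
GROUP_BY_CLASS = {1: "coniferous", 2: "deciduous"}

# One representative (dominant) species per group; "birch" is an assumption.
DOMINANT_SPECIES = {"coniferous": "pine", "deciduous": "birch"}


def species_map(class_grid):
    """Convert a grid of land cover codes into a grid of species names.

    Cells with non-forest or unknown codes become None.
    """
    return [
        [DOMINANT_SPECIES.get(GROUP_BY_CLASS.get(code)) for code in row]
        for row in class_grid
    ]


if __name__ == "__main__":
    grid = [
        [1, 2],
        [0, 1],  # 0 stands for a non-forest class here
    ]
    print(species_map(grid))
```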
With the development of machine learning methods and the improvement of remote sensing data quality, it has become possible to classify areas occupied by specific tree species on satellite images [40,41,42,43,44]. Different species can be separated with 95%–97% accuracy in high and ultra-high resolution images, which makes it possible to build detailed maps of forest species composition based on classification materials. If a training sample is available, this approach makes it possible to conduct an automated forest inventory. We are also working in this direction, using a unique regional set of classes and neural networks [45].
Of course, the use of remote sensing data to create initial FLM maps needs further verification. Unlike field data, satellite images are subject to factors that can significantly distort them. Clouds, shadows, aerosols, and light levels perturb the data, and these effects cannot always be removed by correction. Using alternative data also simplifies the study object, which is likely to affect the accuracy of the result. For example, we linked the biomass and age of the trees of the “coniferous forest” group with data on the productivity of a pine stand of a certain density. In reality, over large areas, a forest is rarely homogeneous.
The use of land cover classifications and AGB maps is limited by their release date: such materials take a long time to produce, so finding up-to-date data for the current or previous year can be difficult or even impossible. When preparing initial data from various sources, it is necessary to pay attention to the year of their production. Forest parameters change regularly because of natural growth, felling, and disturbance, so even a one-year difference in data collection can significantly increase the error. In our study, most of the products used correspond to 2018.
The advantages of the proposed approach are lower cost and higher speed of data preparation for model initialization compared to a ground survey of forests. As the size of the survey area increases from the local level to the level of regional landscapes, the amount of work to collect in situ data increases significantly, so processing information from open databases allows us to increase the efficiency of modeling.
The process of data collection and processing described in the paper made it possible to create the initial map of the species, age classes, and ecoregions, and to calibrate the parameters of the LANDIS-II model. The inventory data provided only the areas by tree species and age class for the forestry as a whole, and these values were used to calibrate the model parameters. The selected satellite data made it possible to divide the study area into 791 sites and to describe the species–age composition in each of them.
The simulation results for the Goloustnenskye forestry area showed that LANDIS-II provides useful predictive data. Information about the spatial dynamics of the biomass of various tree species helps us to understand the future state of the territory and to predict changes in the structure and functioning of forests in response to climatic, anthropogenic, and other factors.