Predicting and Mapping of Soil Organic Carbon Using Machine Learning Algorithms in Northern Iran
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Soil Data
2.3. Auxiliary Variables
2.4. Selection of Auxiliary Variables Using Genetic Algorithms (GA)
2.5. Machine Learning Techniques
2.5.1. Support Vector Machines (SVMs)
2.5.2. Artificial Neural Networks
2.5.3. Regression Tree (Cubist)
2.5.4. Random Forest (RF)
2.5.5. Extreme Gradient Boosting (XGBoost)
2.5.6. Deep Neural Networks (DNN)
2.6. Evaluation of Algorithm Performance
2.7. Uncertainty Assessment
3. Results and Discussions
3.1. Summary Statistics
3.2. Selected Auxiliary Data
3.3. Machine Learning Performances
3.4. Spatial Prediction of SOC with Uncertainty Estimates
3.5. SOC Contents in Soil Classes and Geological Eras
3.6. SOC Contents in Landform Units and Land Uses
4. Conclusions
Author Contributions
Conflicts of Interest
Appendix A
No. | Covariates | Definition |
1 | Aspect | The compass direction of the maximum rate of change |
2 | Catchment Slope | Average gradient above flow path |
3 | Channel networks base level | The interpolated channel network base level elevations |
4 | Convergence Index | Index of convergence/divergence with respect to overland flow |
5 | Cross-Sectional Curvature | The surface normal and a tangent to the contour—perpendicular to maximum gradient direction |
6 | Diffuse Insolation | Calculate the diffuse incoming solar radiation |
7 | Direct Insolation | Calculate the direct incoming solar radiation |
8 | Downslope Curvature | Calculates the local curvature of a cell as sum of the gradients to its neighbor cells |
9 | Elevation | Height above sea level (m) |
10 | Flow Accumulation | Calculates accumulated flow |
11 | Flow Path Length | The distance from any point in the watershed to the watershed outlet |
12 | Local Curvature | The degree to which a curve deviates from a straight line |
13 | Mass Balance Index | Balance between soil mass deposited and eroded |
14 | Multiresolution Ridge-top Flatness Index | Measure of flatness and upness |
15 | Multiresolution Valley Bottom Flatness Index | Measure of flatness and lowness |
16 | Normalized Height | Normalized height is defined by slope height and valley depth |
17 | Openness (NegOpen) | How wide a landscape can be viewed from any position |
18 | Openness (PosOpen) | How wide a landscape can be viewed from any position |
19 | Plan curvature | The curvature of a contour line formed by intersecting a horizontal plane with the surface |
20 | Relative Slope Position | The position of one point relative to the ridge and valley of a slope |
21 | Slope Gradient | Local rate of change in elevation (steepness) at a cell |
22 | Slope Length | Calculate the length of slope |
23 | Slope Length factor | Slope Length and Steepness factor |
24 | Topographic Wetness index | Ln (FA/SG) |
25 | Total Insolation | Calculate the total incoming solar radiation |
26 | Upslope Curvature | The distance weighted average local curvature in a cell’s upslope contributing area |
27 | Valley Depth | The vertical distance to a channel network base level |
28 | Vector Terrain Ruggedness | Measures terrain ruggedness |
29 | Vertical distance to channel networks | The altitude above the channel network |
30 | Wind Effect | The Wind Effect is a dimensionless index |
31 | Blue | Wavelength of 0.450–0.515 μm of Landsat 8 spectral band |
32 | Green | Wavelength of 0.525–0.600 μm of Landsat 8 spectral band |
33 | Red | Wavelength of 0.630–0.680 μm of Landsat 8 spectral band |
34 | Near infrared | Wavelength of 0.845–0.885 μm of Landsat 8 spectral band |
35 | Shortwave infrared-1 | Wavelength of 1.560–1.660 μm of Landsat 8 spectral band |
36 | Shortwave infrared-2 | Wavelength of 2.100–2.300 μm of Landsat 8 spectral band |
37 | Principal Component 1 | The first principal component of Landsat 8 spectral band |
38 | Principal Component 2 | The second principal component of Landsat 8 spectral band |
39 | Principal Component 3 | The third principal component of Landsat 8 spectral band |
40 | TASSELED CAP 1 | The overall brightness of the image |
41 | TASSELED CAP 2 | The overall greenness of the image |
42 | TASSELED CAP 3 | The overall wetness of the image |
43 | Wetness brightness difference index | TASSELED CAP 3/TASSELED CAP 1 |
44 | Atmospherically Resistant Vegetation Index | −0.18 + 1.17 × (NIR − RED)/(NIR + RED) |
45 | Blue-Wide Dynamic Range Vegetation Index | (0.1 × NIR − BLUE)/(0.1 × NIR + BLUE) |
46 | Brightness Index | (RED^2 + NIR^2)^0.5 |
47 | Canopy Index | (SWIR-1-GREEN) |
48 | Carbonate Index | (RED/GREEN) |
49 | Chlorophyll vegetation index | (NIR × RED)/(GREEN)^0.5 |
50 | Clay Index | (SWIR-1/SWIR-2) |
51 | Coloration Index | (RED − GREEN)/(RED + GREEN) |
52 | Differenced Vegetation Index | (NIR − RED) |
53 | Enhanced Vegetation Index | (NIR − RED)/(NIR + C1 × RED − C2 × BLUE + L) |
54 | Ferrous Minerals | (SWIR-1/NIR) |
55 | Green Atmospherically Resistant Vegetation Index | (NIR − (GREEN − (BLUE − RED)))/(NIR − (GREEN + (BLUE − RED))) |
56 | Green Leaf Index | (2 × GREEN − RED − BLUE)/(2 × GREEN + RED + BLUE) |
57 | Green Normalized Difference Vegetation Index | (NIR − GREEN)/(NIR + GREEN) |
58 | Green Vegetation Index | (0.29 × GREEN − 0.56 × RED + 0.6 × SWIR-1 + 0.49 × GREEN) |
59 | Green-Blue NDVI | (NIR − (GREEN + BLUE))/(NIR + (GREEN + BLUE)) |
60 | Green-Red Vegetation Index | (GREEN − RED) |
61 | Gypsum index | (SWIR-1 − NIR)/(SWIR-1 + NIR) |
62 | Hue Index | (2 × (RED − GREEN − BLUE))/(GREEN − BLUE) |
63 | Infrared Percentage Vegetation Index | NIR/(NIR + RED) |
64 | Iron Oxide | (RED/BLUE) |
65 | Leaf Water Content | (SWIR-1/SWIR-2) |
66 | Modified Soil Adjusted Vegetation Index | (0.5 × ((2 × (NIR + 1)) − (((2 × NIR) + 1)^2 − 8 × (NIR − RED))^0.5)) |
67 | Near Infrared Ratio | (NIR/RED) |
68 | Norm GREEN | (GREEN/(NIR + RED + GREEN)) |
69 | Norm NIR | (NIR/(NIR + RED + GREEN)) |
70 | Norm RED | (RED/(NIR + RED + GREEN)) |
71 | Normalized Based | (NIR − (BLUE + GREEN))/(NIR + (BLUE + GREEN)) |
72 | Normalized Canopy Index | (SWIR-1 − GREEN)/(SWIR-1 + GREEN) |
73 | Normalized Difference Moisture Index | (NIR − SWIR-1)/(NIR + SWIR-1) |
74 | Normalized Difference Salinity Index | (RED − NIR)/(RED + NIR) |
75 | Normalized Difference Vegetation Index | (NIR − RED)/(NIR + RED) |
76 | Perpendicular Vegetation Index | (NIR − r) cos µ − RED × sin µ |
77 | Ratio Vegetation Index | (NIR/RED)/(GREEN + RED) |
78 | Redness Index | RED^2/(BLUE × GREEN) |
79 | Reflectance Absorption Index | (NIR/(RED + SWIR-1)) |
80 | Renormalized difference Vegetation Index | (NIR − RED)/(NIR + RED)^0.5 |
81 | MODIS Red | Wavelength of 0.620–0.670 μm of MODIS spectral band |
82 | MODIS Near Infrared | Wavelength of 0.841–0.876 μm of MODIS spectral band |
83 | MODIS Night Temperature | Land Surface Temperature/Emissivity Daily L3 Global 1 km |
84 | MODIS Day Temperature | Land Surface Temperature/Emissivity Daily L3 Global 1 km |
85 | MODIS Normalized Difference Vegetation Index | (MODIS NIR − MODIS RED)/(MODIS NIR + MODIS RED) |
86 | MODIS Brightness Index | ((MODIS RED)^2 + (MODIS NIR)^2)^0.5 |
87 | Soil Adjusted Vegetation Index | (1 + L) × (NIR − RED)/(NIR + RED + L) |
88 | Specific Leaf Area Vegetation Index | NIR/(RED + SWIR-1) |
89 | Stress Related | (BLUE × GREEN)/RED |
90 | Vegetation Index | (SWIR-2 − SWIR-1)/(SWIR-2 + SWIR-1) |
91 | Annual Precipitation | It is derived from the monthly rainfall values |
92 | Precipitation Seasonality (Coefficient of Variation) | It is derived from the monthly rainfall values |
93 | Precipitation of Wettest Month | It is derived from the monthly rainfall values |
94 | Precipitation of Driest Month | It is derived from the monthly rainfall values |
95 | Mean Annual Temperature | It is derived from the monthly temperature values |
96 | Mean Annual Wind Speed | It is derived from the monthly wind speed values |
97 | Mean Annual Water Vapor Pressure | It is derived from the monthly water vapor pressure values |
98 | Mean Annual Actual Evapo-Transpiration | It is derived from the monthly actual evapo-transpiration values |
99 | Mean Annual Potential Evapo-Transpiration | It is derived from the monthly potential evapo-transpiration values |
100 | Global Aridity Index | It shows the rainfall deficit for potential vegetative growth |
101 | Soil Map | Soil and Water Research Institute of Iran |
102 | Geology Map | Soil and Water Research Institute of Iran |
103 | Land Use Map | Soil and Water Research Institute of Iran |
104 | Physiography Map | Soil and Water Research Institute of Iran |
105 | Erosion Classes Map | Soil and Water Research Institute of Iran |
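Several of the remote sensing covariates in the table above are simple band-ratio formulas. The following is a minimal Python sketch of how a few of them (NDVI, Green NDVI, NDMI, SAVI, Brightness Index, Clay Index) could be computed for a single pixel. It is an illustration only, not the authors' processing chain: the function names, the example reflectance values, and the soil adjustment factor L = 0.5 (the table does not state the value of L) are assumptions.

```python
import math


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b); NaN when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else float("nan")


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - RED)/(NIR + RED)."""
    return normalized_difference(nir, red)


def gndvi(nir: float, green: float) -> float:
    """Green NDVI: (NIR - GREEN)/(NIR + GREEN)."""
    return normalized_difference(nir, green)


def ndmi(nir: float, swir1: float) -> float:
    """Normalized Difference Moisture Index: (NIR - SWIR-1)/(NIR + SWIR-1)."""
    return normalized_difference(nir, swir1)


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil Adjusted Vegetation Index: (1 + L)(NIR - RED)/(NIR + RED + L).

    soil_factor is L; the default 0.5 is an assumed value, not taken from the table.
    """
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def brightness_index(red: float, nir: float) -> float:
    """Brightness Index: (RED^2 + NIR^2)^0.5."""
    return math.sqrt(red ** 2 + nir ** 2)


def clay_index(swir1: float, swir2: float) -> float:
    """Clay Index: SWIR-1/SWIR-2."""
    return swir1 / swir2


if __name__ == "__main__":
    # Hypothetical surface reflectances for one pixel (not study data).
    pixel = {"green": 0.08, "red": 0.10, "nir": 0.30, "swir1": 0.20, "swir2": 0.10}
    print(f"NDVI  = {ndvi(pixel['nir'], pixel['red']):.3f}")
    print(f"GNDVI = {gndvi(pixel['nir'], pixel['green']):.3f}")
    print(f"NDMI  = {ndmi(pixel['nir'], pixel['swir1']):.3f}")
    print(f"SAVI  = {savi(pixel['nir'], pixel['red']):.3f}")
    print(f"BI    = {brightness_index(pixel['red'], pixel['nir']):.3f}")
    print(f"Clay  = {clay_index(pixel['swir1'], pixel['swir2']):.3f}")
```

In practice the same expressions would be applied to whole raster bands rather than single values; the scalar form is used here only to keep the sketch free of external dependencies.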
- Edenhofer, O.; Pichs-Madruga, R.; Sokona, Y.; Seyboth, K.; Kadner, S.; Zwickel, T.; Eickemeier, P.; Hansen, G.; Schlömer, S.; von Stechow, C. Renewable Energy Sources and Climate Change Mitigation: Special Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
- Adhikari, K.; Hartemink, A.E. Digital Mapping of Topsoil Carbon Content and Changes in the Driftless Area of Wisconsin, USA. Soil Sci. Soc. Am. J. 2015, 79, 155–164. [Google Scholar] [CrossRef] [Green Version]
- Lal, R. Soil carbon sequestration to mitigate climate change. Geoderma 2004, 123, 1–22. [Google Scholar] [CrossRef]
- Minasny, B.; McBratney, A.B.; Malone, B.P.; Wheeler, I. Digital mapping of soil carbon. In Advances in Agronomy; Elsevier: Amsterdam, The Netherlands, 2013; Volume 118, pp. 1–47. [Google Scholar]
- Yang, R.-M.; Zhang, G.-L.; Liu, F.; Lu, Y.-Y.; Yang, F.; Yang, F.; Yang, M.; Zhao, Y.-G.; Li, D.-C. Comparison of boosted regression tree and random forest models for mapping topsoil organic carbon concentration in an alpine ecosystem. Ecol. Indic. 2016, 60, 870–878. [Google Scholar] [CrossRef]
- Emadi, M.; Baghernejad, M.; Bahmanyar, M.A.; Morovvat, A. Changes in soil inorganic phosphorous pools along a precipitation gradient in northern Iran. Int. J. For. Soil Eros. 2012, 2, 143–147. [Google Scholar]
- Ogle, S.M.; Paustian, K. Soil organic carbon as an indicator of environmental quality at the national scale: Inventory monitoring methods and policy relevance. Can. J. Soil Sci. 2005, 85, 531–540. [Google Scholar] [CrossRef]
- Jenny, H. Factors of Soil Formation: A System of Quantitative Pedology; Courier Corporation: North Chelmsford, MA, USA, 1994. [Google Scholar]
- Somarathna, P.; Minasny, B.; Malone, B.P. More data or a better model? Figuring out what matters most for the spatial prediction of soil carbon. Soil Sci. Soc. Am. J. 2017, 81, 1413–1426. [Google Scholar] [CrossRef]
- Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine learning in agriculture: A review. Sensors 2018, 18, 2674. [Google Scholar] [CrossRef] [Green Version]
- Padarian, J.; Minasny, B.; McBratney, A.B. Using deep learning for digital soil mapping: A review aided by machine learning tools. Soil 2019, 5, 79–89. [Google Scholar] [CrossRef] [Green Version]
- Mahmoudzadeh, H.; Matinfar, H.R.; Taghizadeh-Mehrjardi, R.; Kerry, R. Spatial prediction of soil organic carbon using machine learning techniques in western Iran. Geoderma Reg. 2020, 21, e00260. [Google Scholar] [CrossRef]
- McBratney, A.B.; Stockmann, U.; Angers, D.A.; Minasny, B.; Field, D.J. Challenges for soil organic carbon research. In Soil Carbon; Springer: Berlin/Heidelberg, Germany, 2014; pp. 3–16. [Google Scholar]
- Lamichhane, S.; Kumar, L.; Wilson, B. Digital soil mapping algorithms and covariates for soil organic carbon mapping and their implications: A review. Geoderma 2019, 352, 395–413. [Google Scholar] [CrossRef]
- Zhang, G.; Feng, L.; Song, X. Recent progress and future prospect of digital soil mapping: A review. J. Integr. Agric. 2017, 16, 2871–2885. [Google Scholar] [CrossRef]
- Wang, B.; Waters, C.; Orgill, S.; Cowie, A.; Clark, A.; Li Liu, D.; Simpson, M.; McGowen, I.; Sides, T. Estimating soil organic carbon stocks using different modelling techniques in the semi-arid rangelands of eastern Australia. Ecol. Indic. 2018, 88, 425–438. [Google Scholar] [CrossRef]
- Xiao, J.; Chevallier, F.; Gomez, C.; Guanter, L.; Hicke, J.A.; Huete, A.R.; Ichii, K.; Ni, W.; Pang, Y.; Rahman, A.F. Remote sensing of the terrestrial carbon cycle: A review of advances over 50 years. Remote Sens. Environ. 2019, 233, 111383. [Google Scholar] [CrossRef]
- Mishra, U.; Lal, R.; Liu, D.; Van Meirvenne, M. Predicting the spatial variation of the soil organic carbon pool at a regional scale. Soil Sci. Soc. Am. J. 2010, 74, 906–914. [Google Scholar] [CrossRef]
- Veronesi, F.; Schillaci, C. Comparison between geostatistical and machine learning models as predictors of topsoil organic carbon with a focus on local uncertainty estimation. Ecol. Indic. 2019, 101, 1032–1044. [Google Scholar] [CrossRef]
- Zhang, H.T.; Gao, M.X. The Application of Support Vector Machine (SVM) Regression Method in Tunnel Fires. Procedia Eng. 2018, 211, 1004–1011. [Google Scholar] [CrossRef]
- Castaldi, F.; Chabrillat, S.; Chartin, C.; Genot, V.; Jones, A.; van Wesemael, B. Estimation of soil organic carbon in arable soil in Belgium and Luxembourg with the LUCAS topsoil database. Eur. J. Soil Sci. 2018, 69, 592–603. [Google Scholar] [CrossRef]
- Malone, B.P.; McBratney, A.; Minasny, B.; Laslett, G. Mapping continuous depth functions of soil carbon storage and available water capacity. Geoderma 2009, 154, 138–152. [Google Scholar] [CrossRef]
- Were, K.; Bui, D.T.; Dick, Ø.B.; Singh, B.R. A comparative assessment of support vector regression, artificial neural networks, and random forests for predicting and mapping soil organic carbon stocks across an Afromontane landscape. Ecol. Indic. 2015, 52, 394–403. [Google Scholar] [CrossRef]
- Zhao, Z.; Yang, Q.; Benoy, G.; Chow, T.L.; Xing, Z.; Rees, H.W.; Meng, F.-R. Using artificial neural network models to produce soil organic carbon content distribution maps across landscapes. Can. J. Soil Sci. 2010, 90, 75–87. [Google Scholar] [CrossRef]
- Taghizadeh-Mehrjardi, R.; Nabiollahi, K.; Kerry, R. Digital mapping of soil organic carbon at multiple depths using different data mining techniques in Baneh region, Iran. Geoderma 2016, 266, 98–110. [Google Scholar] [CrossRef]
- Ballabio, C. Spatial prediction of soil properties in temperate mountain regions using support vector regression. Geoderma 2009, 151, 338–350. [Google Scholar] [CrossRef]
- Rossel, R.V.; Behrens, T. Using data mining to model and interpret soil diffuse reflectance spectra. Geoderma 2010, 158, 46–54. [Google Scholar] [CrossRef]
- Shepherd, K.D.; Walsh, M.G. Development of reflectance spectral libraries for characterization of soil properties. Soil Sci. Soc. Am. J. 2002, 66, 988–998. [Google Scholar] [CrossRef]
- Akpa, S.I.; Odeh, I.O.; Bishop, T.F.; Hartemink, A.E.; Amapu, I.Y. Total soil organic carbon and carbon sequestration potential in Nigeria. Geoderma 2016, 271, 202–215. [Google Scholar] [CrossRef]
- Gray, J.M.; Bishop, T.F.; Wilson, B.R. Factors controlling soil organic carbon stocks with depth in eastern Australia. Soil Sci. Soc. Am. J. 2015, 79, 1741–1751. [Google Scholar] [CrossRef] [Green Version]
- Martin, M.; Wattenbach, M.; Smith, P.; Meersmans, J.; Jolivet, C.; Boulonne, L.; Arrouays, D. Spatial distribution of soil organic carbon stocks in France: Discussion paper. Biogeosci. Discuss. 2010, 7, 8409–8443. [Google Scholar] [CrossRef] [Green Version]
- Wang, B.; Waters, C.; Orgill, S.; Gray, J.; Cowie, A.; Clark, A.; Li Liu, D. High resolution mapping of soil organic carbon stocks using remote sensing variables in the semi-arid rangelands of eastern Australia. Sci. Total Environ. 2018, 630, 367–378. [Google Scholar] [CrossRef]
- Nabiollahi, K.; Eskandari, S.; Taghizadeh-Mehrjardi, R.; Kerry, R.; Triantafalis, J. Assessing soil organic carbon stocks under land-use change scenarios using random forest models. Carbon Manag. 2019, 10, 63–77. [Google Scholar] [CrossRef]
- Zeraatpisheh, M.; Ayoubi, S.; Jafari, A.; Tajik, S.; Finke, P. Digital mapping of soil properties using multiple machine learning in a semi-arid region, central Iran. Geoderma 2019, 338, 445–452. [Google Scholar] [CrossRef]
- Webster, R.; Oliver, M.A. Geostatistics for Environmental Scientists; John Wiley & Sons: Hoboken, NJ, USA, 2007. [Google Scholar]
- Taghizadeh-Mehrjardi, R.; Neupane, R.; Sood, K.; Kumar, S. Artificial bee colony feature selection algorithm combined with machine learning algorithms to predict vertical and lateral distribution of soil organic matter in South Dakota, USA. Carbon Manag. 2017, 8, 277–291. [Google Scholar] [CrossRef]
- Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Salakhutdinov, R.; Tenenbaum, J.B.; Torralba, A. Learning with hierarchical-deep models. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1958–1971. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Kamilaris, A.; Prenafeta-Boldú, F.X. Deep learning in agriculture: A survey. Comput. Electron. Agric. 2018, 147, 70–90. [Google Scholar] [CrossRef] [Green Version]
- Song, X.; Zhang, G.; Liu, F.; Li, D.; Zhao, Y.; Yang, J. Modeling spatio-temporal distribution of soil moisture by deep learning-based cellular automata model. J. Arid Land 2016, 8, 734–748. [Google Scholar] [CrossRef] [Green Version]
- Padarian, J.; Minasny, B.; McBratney, A. Using deep learning to predict soil properties from regional spectral data. Geoderma Reg. 2019, 16, e00198. [Google Scholar] [CrossRef]
- Wadoux, A.M.J.C.; Padarian, J.; Minasny, B. Multi-source data integration for soil mapping using deep learning. SOIL 2019, 5, 107–119. [Google Scholar] [CrossRef] [Green Version]
- Xu, Z.; Zhao, X.; Guo, X.; Guo, J. Deep Learning Application for Predicting Soil Organic Matter Content by VIS-NIR Spectroscopy. Comput. Intell. Neurosci. 2019, 2019, 3563761. [Google Scholar] [CrossRef]
- Taghizadeh-Mehrjardi, R.; Schmidt, K.; Amirian-Chakan, A.; Rentschler, T.; Zeraatpisheh, M.; Sarmadian, F.; Valavi, R.; Davatgar, N.; Behrens, T.; Scholten, T. Improving the Spatial Prediction of Soil Organic Carbon Content in Two Contrasting Climatic Regions by Stacking Machine Learning Models and Rescanning Covariate Space. Remote Sens. 2020, 12, 1095. [Google Scholar] [CrossRef] [Green Version]
- Shirani, H.; Habibi, M.; Besalatpour, A.; Esfandiarpour, I. Determining the features influencing physical quality of calcareous soils in a semiarid region of Iran using a hybrid PSO-DT algorithm. Geoderma 2015, 259, 1–11. [Google Scholar] [CrossRef]
- Xie, H.; Zhao, J.; Wang, Q.; Sui, Y.; Wang, J.; Yang, X.; Zhang, X.; Liang, C. Soil type recognition as improved by genetic algorithm-based variable selection using near infrared spectroscopy and partial least squares discriminant analysis. Sci. Rep. 2015, 5, 10930. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Pourmohammadali, B.; Hosseinifard, S.J.; Salehi, M.H.; Shirani, H.; Boroujeni, E. Effects of soil properties, water quality and management practices on pistachio yield in Rafsanjan region, southeast of Iran. Agric. Water Manag. 2019, 213, 894–902. [Google Scholar] [CrossRef]
- Besalatpour, A.A.; Ayoubi, S.; Hajabbasi, M.A.; Jazi, A.Y.; Gharipour, A. Feature Selection Using Parallel Genetic Algorithm for the Prediction of Geometric Mean Diameter of Soil Aggregates by Machine Learning Methods. Arid Land Res. Manag. 2014, 28, 383–394. [Google Scholar] [CrossRef] [Green Version]
- Behrens, T.; Zhu, A.X.; Schmidt, K.; Scholten, T. Multi-scale digital terrain analysis and feature selection for digital soil mapping. Geoderma 2010, 155, 175–185. [Google Scholar] [CrossRef]
- Taghizadeh-mehrjardi, R.; Toomanian, N.; Khavaninzadeh, A.; Jafari, A.; Triantafilis, J. Predicting and mapping of soil particle-size fractions with adaptive neuro-fuzzy inference and ant colony optimization in central I ran. Eur. J. Soil Sci. 2016, 67, 707–725. [Google Scholar] [CrossRef]
- Calixto, W.P.; Martins Neto, L.; Wu, M.; Kliemann, H.J.; de Castro, S.S.; Yamanaka, K. Calculation of soil electrical conductivity using a genetic algorithm. Comput. Electron. Agric. 2010, 71, 1–6. [Google Scholar] [CrossRef]
- Welikala, R.A.; Fraz, M.M.; Dehmeshki, J.; Hoppe, A.; Tah, V.; Mann, S.; Williamson, T.H.; Barman, S.A. Genetic algorithm based feature selection combined with dual classification for the automated detection of proliferative diabetic retinopathy. Comput. Med Imaging Graph. 2015, 43, 64–77. [Google Scholar] [CrossRef] [Green Version]
- Zeraatpisheh, M.; Jafari, A.; Bodaghabadi, M.B.; Ayoubi, S.; Taghizadeh-Mehrjardi, R.; Toomanian, N.; Kerry, R.; Xu, M. Conventional and digital soil mapping in Iran: Past, present, and future. Catena 2020, 188, 104424. [Google Scholar] [CrossRef]
- Guan, J.H.; Deng, L.; Zhang, J.G.; He, Q.Y.; Shi, W.Y.; Li, G.; Du, S. Soil organic carbon density and its driving factors in forest ecosystems across a northwestern province in China. Geoderma 2019, 352, 1–12. [Google Scholar] [CrossRef]
- Cruz-Cárdenas, G.; López-Mata, L.; Ortiz-Solorio, C.A.; Villaseñor, J.L.; Ortiz, E.; Silva, J.T.; Estrada-Godoy, F. Interpolation of mexican soil properties at a scale of 1:1,000,000. Geoderma 2014, 213, 29–35. [Google Scholar] [CrossRef]
- Guo, Z.; Adhikari, K.; Chellasamy, M.; Greve, M.B.; Owens, P.R.; Greve, M.H. Selection of terrain attributes and its scale dependency on soil organic carbon prediction. Geoderma 2019, 340, 303–312. [Google Scholar] [CrossRef]
- Emadi, M.; Shahriari, A.R.; Sadegh-Zadeh, F.; Jalili Seh-Bardan, B.; Dindarlou, A. Geostatistics-based spatial distribution of soil moisture and temperature regime classes in Mazandaran province, northern Iran. Arch. Agron. Soil Sci. 2016, 62, 502–522. [Google Scholar] [CrossRef]
- Emadi, M.; Baghernejad, M.; Memarian, H.R. Effect of land-use change on soil fertility characteristics within water-stable aggregates of two cultivated soils in northern Iran. Land Use Policy 2009, 26, 452–457. [Google Scholar] [CrossRef]
- Zeraatpishe, M.; Khormali, F. Carbon stock and mineral factors controlling soil organic carbon in a climatic gradient, Golestan province. J. Soil Sci. Plant Nutr. 2012, 12, 637–654. [Google Scholar] [CrossRef]
- Darabi, N. Mapping Saline Soils Using GIS and RS Techniques. Master’s Thesis, Sari University of Agricultural Sciences and Natural Resources, Sari, Iran, 2016. [Google Scholar]
- Maldari, M. Testing Performance of Vis-Infrared Spectral Reflectance for Estimation of Soil Properties. Master’s Thesis, Sari University of Agricultural Sciences and Natural Resources, Sari, Iran, 2016. [Google Scholar]
- Masoudi, S. Using Geostatistical and Fuzzy Approaches for Delineation of Soil Management Zone by Soil Properties and Wheat Yield, Northern Iran. Master’s Thesis, Sari University of Agricultural Sciences and Natural Resources, Sari, Iran, 2016. [Google Scholar]
- Sajjadi, F. Spatial Variability of Some Soil Properties in Different Landscape, Northern Iran. Master’s Thesis, Sari University of Agricultural Sciences and Natural Resources, Sari, Iran, 2016. [Google Scholar]
- Sojoodeh, A. Spatial Variability of Some Soil Physical and Chemical Properties and Comparison of Geostatistical Approaches in Soil Mapping. Master’s Thesis, Sari University of Agricultural Sciences and Natural Resources, Sari, Iran, 2015. [Google Scholar]
- Amiri, E. Calibration and testing of the Aquacrop model for rice under water and nitrogen management. Commun. Soil Sci. Plant Anal. 2016, 47, 387–403. [Google Scholar] [CrossRef]
- Sayão, V.M.; Demattê, J.A. Soil texture and organic carbon mapping using surface temperature and reflectance spectra in Southeast Brazil. Geoderma Reg. 2018, 14, e00174. [Google Scholar] [CrossRef]
- Gallant, J.C.; Dowling, T.I. A multi-resolution index of valley bottom flatness for mapping depositional areas. Water Resour. Res. 2003, 39, 1347. [Google Scholar] [CrossRef]
- Fick, S.E.; Hijmans, R.J. WorldClim 2: New 1-km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 2017, 37, 4302–4315. [Google Scholar] [CrossRef]
- Banaei, M.; Moameni, A.; Bybordi, M.; Malakouti, M. The Soils of Iran: New Achievements in Perception, Management and Use; Soil and Water Research Institute: Tehran, Iran, 2005. [Google Scholar]
- Tajik, S.; Zarinkamar, F.; Soltani, B.M.; Nazari, M. Induction of phenolic and flavonoid compounds in leaves of saffron (Crocus sativus L.) by salicylic acid. Sci. Hortic. 2019, 257, 108751. [Google Scholar] [CrossRef]
- Huang, Y.; Lan, Y.; Thomson, S.J.; Fang, A.; Hoffmann, W.C.; Lacey, R.E. Development of soft computing and applications in agricultural and biological engineering. Comput. Electron. Agric. 2010, 71, 107–127. [Google Scholar] [CrossRef] [Green Version]
- Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. 2008, 28, 1–26. [Google Scholar] [CrossRef] [Green Version]
- González Costa, J.J.; Reigosa, M.J.; Matías, J.M.; Covelo, E.F. Soil Cd, Cr, Cu, Ni, Pb and Zn sorption and retention models using SVM: Variable selection and competitive model. Sci. Total Environ. 2017, 593–594, 508–522. [Google Scholar] [CrossRef] [PubMed]
- Abrougui, K.; Gabsi, K.; Mercatoris, B.; Khemis, C.; Amami, R.; Chehaibi, S. Prediction of organic potato yield using tillage systems and soil properties by artificial neural network (ANN) and multiple linear regressions (MLR). Soil Tillage Res. 2019, 190, 202–208. [Google Scholar] [CrossRef]
- Ochoa-Martínez, C.I.; Ayala-Aponte, A.A. prediction of mass transfer kinetics during osmotic dehydration of apples using neural networks. LWT Food Sci. Technol. 2007, 40, 638–645. [Google Scholar] [CrossRef]
- Trigui, M.; Gabsi, K.; Amri, I.E.; Helal, A.N.; Barrington, S. Modular Feed Forward Networks to Predict Sugar Diffusivity from Date Pulp Part I. Model Validation. Int. J. Food Prop. 2011, 14, 356–370. [Google Scholar] [CrossRef]
- Fernandes, M.M.H.; Coelho, A.P.; Fernandes, C.; da Silva, M.F.; Dela Marta, C.C. Estimation of soil organic matter content by modeling with artificial neural networks. Geoderma 2019, 350, 46–51. [Google Scholar] [CrossRef]
- Yilmaz, I.; Kaynar, O. Multiple regression, ANN (RBF, MLP) and ANFIS models for prediction of swell potential of clayey soils. Expert Syst. Appl. 2011, 38, 5958–5966. [Google Scholar] [CrossRef]
- Candel, A.; Parmar, V.; LeDell, E.; Arora, A. Deep Learning with H2O; Inc.: Mountain View, CA, USA, 2016. [Google Scholar]
- Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Wadsworth International Group: Belmont, CA, USA, 1984; Volume 432, pp. 151–166. [Google Scholar]
- Mikkonen, H.G.; van de Graaff, R.; Clarke, B.O.; Dasika, R.; Wallis, C.J.; Reichman, S.M. Geochemical indices and regression tree models for estimation of ambient background concentrations of copper, chromium, nickel and zinc in soil. Chemosphere 2018, 210, 193–203. [Google Scholar] [CrossRef]
- Malone, B.P.; Styc, Q.; Minasny, B.; McBratney, A.B. Digital soil mapping of soil carbon at the farm scale: A spatial downscaling approach in consideration of measured and uncertain data. Geoderma 2017, 290, 91–99. [Google Scholar] [CrossRef]
- Appelhans, T.; Mwangomo, E.; Hardy, D.R.; Hemp, A.; Nauss, T. Evaluating machine learning approaches for the interpolation of monthly air temperature at Mt. Kilimanjaro, Tanzania. Spat. Stat. 2015, 14, 91–113. [Google Scholar] [CrossRef] [Green Version]
- Kuhn, M.; Weston, S.; Keefer, C.; Coulter, N. Cubist models for regression. R Package Vignette R Package Version 0.0 2012, 18, 223–244. [Google Scholar]
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
- Fu, B.; Wang, Y.; Campbell, A.; Li, Y.; Zhang, B.; Yin, S.; Xing, Z.; Jin, X. Comparison of object-based and pixel-based Random Forest algorithm for wetland vegetation mapping using high spatial resolution GF-1 and SAR data. Ecol. Indic. 2017, 73, 105–117. [Google Scholar] [CrossRef]
- Houborg, R.; McCabe, M.F. A hybrid training approach for leaf area index estimation via Cubist and random forests machine-learning. ISPRS J. Photogramm. Remote Sens. 2018, 135, 173–188. [Google Scholar] [CrossRef]
- Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining; ACM: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
- Fan, J.; Wang, X.; Wu, L.; Zhou, H.; Zhang, F.; Yu, X.; Lu, X.; Xiang, Y. Comparison of Support Vector Machine and Extreme Gradient Boosting for predicting daily global solar radiation using temperature and precipitation in humid subtropical climates: A case study in China. Energy Convers. Manag. 2018, 164, 102–111. [Google Scholar] [CrossRef]
- Li, W.; Fu, H.; Yu, L.; Gong, P.; Feng, D.; Li, C.; Clinton, N. Stacked Autoencoder-based deep learning for remote-sensing image classification: A case study of African land-cover mapping. Int. J. Remote Sens. 2016, 37, 5632–5646. [Google Scholar] [CrossRef]
- Sa, I.; Popović, M.; Khanna, R.; Chen, Z.; Lottes, P.; Liebisch, F.; Nieto, J.; Stachniss, C.; Walter, A.; Siegwart, R. Weedmap: A large-scale semantic weed mapping framework using aerial multispectral imaging and deep neural network for precision farming. Remote Sens. 2018, 10, 1423. [Google Scholar] [CrossRef] [Green Version]
- Emadi, M.; Baghernejad, M.; Emadi, M.; Maftoun, M. Assessment of some soil properties by spatial variability in saline and sodic soils in Arsanjan plain, Southern Iran. Pak. J. Biol. Sci. 2008, 11, 238–243. [Google Scholar] [CrossRef] [Green Version]
- Wang, Y.; Zhang, Z.; Feng, L.; Du, Q.; Runge, T. Combining Multi-Source Data and Machine Learning Approaches to Predict Winter Wheat Yield in the Conterminous United States. Remote Sens. 2020, 12, 1232. [Google Scholar] [CrossRef] [Green Version]
- Wang, N.; Zhang, D.; Chang, H.; Li, H. Deep learning of subsurface flow via theory-guided neural network. J. Hydrol. 2020, 584, 124700. [Google Scholar] [CrossRef] [Green Version]
- Floody, M.C.; Theng, B.; Reyes, P.; Mora, M. Natural nanoclays: Applications and future trends—A Chilean perspective. Clay Miner. 2009, 44, 161–176. [Google Scholar] [CrossRef]
- Mitsa, T. How Do You Know You Have Enough Training Data? 2019. Available online: (accessed on 6 June 2020).
- Zhu, X.; Vondrick, C.; Fowlkes, C.C.; Ramanan, D. Do we need more training data? Int. J. Comput. Vis. 2016, 119, 76–92.
- Nagelkerke, N.J. A note on a general definition of the coefficient of determination. Biometrika 1991, 78, 691–692.
- Nickerson, C.A. A note on “A concordance correlation coefficient to evaluate reproducibility”. Biometrics 1997, 53, 1503–1507.
- Willmott, C.J.; Matsuura, K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 2005, 30, 79–82.
- Minasny, B.; Setiawan, B.I.; Arif, C.; Saptomo, S.K.; Chadirin, Y. Digital mapping for cost-effective and accurate prediction of the depth and carbon stocks in Indonesian peatlands. Geoderma 2016, 272, 20–31.
- Griffiths, R.P.; Madritch, M.D.; Swanson, A.K. The effects of topography on forest soil characteristics in the Oregon Cascade Mountains (USA): Implications for the effects of climate change on soil properties. For. Ecol. Manag. 2009, 257, 1–7.
- Ma, M.; Chang, R. Temperature drive the altitudinal change in soil carbon and nitrogen of montane forests: Implication for global warming. Catena 2019, 182, 104126.
- Falahatkar, S.; Hosseini, S.M.; Ayoubi, S.; Salmanmahiny, A. Predicting soil organic carbon density using auxiliary environmental variables in northern Iran. Arch. Agron. Soil Sci. 2016, 62, 375–393.
- Xiong, X.; Grunwald, S.; Myers, D.B.; Kim, J.; Harris, W.G.; Bliznyuk, N. Assessing uncertainty in soil organic carbon modeling across a highly heterogeneous landscape. Geoderma 2015, 251–252, 105–116.
- Nabiollahi, K.; Taghizadeh-Mehrjardi, R.; Eskandari, S. Assessing and monitoring the soil quality of forested and agricultural areas using soil-quality indices and digital soil-mapping in a semi-arid environment. Arch. Agron. Soil Sci. 2018, 64, 696–707.
- Matsushita, B.; Yang, W.; Chen, J.; Onda, Y.; Qiu, G. Sensitivity of the enhanced vegetation index (EVI) and normalized difference vegetation index (NDVI) to topographic effects: A case study in high-density cypress forest. Sensors 2007, 7, 2636–2651.
- Dai, F.; Zhou, Q.; Lv, Z.; Wang, X.; Liu, G. Spatial prediction of soil organic matter content integrating artificial neural network and ordinary kriging in Tibetan Plateau. Ecol. Indic. 2014, 45, 184–194.
- Pei, T.; Qin, C.-Z.; Zhu, A.X.; Yang, L.; Luo, M.; Li, B.; Zhou, C. Mapping soil organic matter using the topographic wetness index: A comparative study based on different flow-direction algorithms and kriging methods. Ecol. Indic. 2010, 10, 610–619.
- Schillaci, C.; Lombardo, L.; Saia, S.; Fantappiè, M.; Märker, M.; Acutis, M. Modelling the topsoil carbon stock of agricultural lands with the Stochastic Gradient Treeboost in a semi-arid Mediterranean region. Geoderma 2017, 286, 35–45.
- Stevens, A.; van Wesemael, B.; Bartholomeus, H.; Rosillon, D.; Tychon, B.; Ben-Dor, E. Laboratory, field and airborne spectroscopy for monitoring organic carbon content in agricultural soils. Geoderma 2008, 144, 395–404.
- Gray, J.; Karunaratne, S.; Bishop, T.; Wilson, B.; Veeragathipillai, M. Driving factors of soil organic carbon fractions over New South Wales, Australia. Geoderma 2019, 353, 213–226.
- Khormali, F.; Ghergherechi, S.; Kehl, M.; Ayoubi, S. Soil formation in loess-derived soils along a subhumid to humid climate gradient, Northeastern Iran. Geoderma 2012, 179–180, 113–122.
- Pourmasoumi, M.; Khormali, F.; Ayoubi, S.; Kehl, M.; Kiani, F. Development and magnetic properties of loess-derived forest soils along a precipitation gradient in northern Iran. J. Mt. Sci. 2019, 16, 1848–1868.
- Rossi, A.M.; Rabenhorst, M.C. Organic carbon dynamics in soils of Mid-Atlantic barrier island landscapes. Geoderma 2019, 337, 1278–1290.
ML Algorithms | Hyperparameters | Definition | Defined Parameters |
SVM (support vector machines) | Kernel type | the kernel function | RBF |
SVM | C | the penalty parameter | 0.01–100 |
SVM | | the bandwidth parameter | 0.01–100 |
Cubist (regression tree) | committees | the number of model trees | 1–100 |
Cubist | neighbors | the number of nearest neighbors | 0–9 |
XGBoost (extreme gradient boosting) | booster | the type of model | gbtree |
XGBoost | max_depth | the maximum depth of a tree | 3–10 |
XGBoost | min_child_weight | the minimum sum of observation weights required in a child node | 0–5 |
XGBoost | colsample_bytree | the fraction of variables supplied to a tree | 0.5–1 |
XGBoost | subsample | the fraction of samples supplied to a tree | 0.5–1 |
XGBoost | eta | learning rate | 0.01–0.5 |
RF (random forest) | Mtry | the number of input variables tried at each split | 1–30 |
RF | Ntree | the number of trees | 100–3000 |
ANN (artificial neural networks) | decay | weight decay (regularization) | 0.001–0.05 |
ANN | size | the number of neurons in the hidden layer | 1–10 |
DNN (deep neural networks) | hidden | the number of hidden layers | 2–10 |
DNN | size | the number of neurons in each hidden layer | 15–200 |
DNN | network weight initialization | the initial weights of the network | uniform/he_normal |
DNN | learning rate | the step size used when adjusting the network weights | 0.001–0.05 |
DNN | dropout regularization | the fraction of neurons that are randomly dropped | 0.2–0.8 |
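The ranges in the table above can be read as a search space. The following is a minimal sketch of drawing candidate configurations from the XGBoost ranges by uniform random sampling. The sampling scheme, the `XGB_SPACE` and `sample_config` names, and the treatment of `min_child_weight` as a continuous value are assumptions made for illustration; the table does not state how the ranges were searched.

```python
import random

# Ranges copied from the hyperparameter table; everything else is illustrative.
XGB_SPACE = {
    "max_depth": ("int", 3, 10),
    "min_child_weight": ("float", 0.0, 5.0),
    "colsample_bytree": ("float", 0.5, 1.0),
    "subsample": ("float", 0.5, 1.0),
    "eta": ("float", 0.01, 0.5),
}


def sample_config(space, rng):
    """Draw one candidate configuration uniformly from each range."""
    config = {}
    for name, (kind, low, high) in space.items():
        if kind == "int":
            config[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            config[name] = rng.uniform(low, high)
    return config


rng = random.Random(42)  # fixed seed so the draw is repeatable
candidates = [sample_config(XGB_SPACE, rng) for _ in range(5)]
for candidate in candidates:
    print(candidate)
```

Each candidate would then be scored by cross-validation and the best one retained; a regular grid over the same ranges is an equally plausible reading of the table.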
Min | Max | Mean | SD | CV (%) | Skewness | Kurtosis |
0.02 | 11.48 | 2.19 | 1.27 | 58.23 | 2.33 | 8.2 |
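As a reading aid for the summary statistics above, the sketch below computes the same quantities for a list of values. The coefficient of variation is the standard deviation divided by the mean, times 100; with the rounded table values, 1.27 / 2.19 gives about 58%, in line with the reported figure. The `describe` helper and the sample values are hypothetical, and whether the table reports raw or excess kurtosis is not stated, so the excess form used here is an assumption.

```python
import math


def describe(values):
    """Return min, max, mean, SD, CV (%), skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Central moments with n in the denominator, used for the shape statistics.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "sd": sd,
        "cv_percent": 100.0 * sd / mean,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # assumption: excess form
    }


# Hypothetical SOC values (%), for illustration only.
soc = [0.8, 1.5, 1.9, 2.2, 2.4, 3.1, 6.0]
stats = describe(soc)
print({key: round(value, 2) for key, value in stats.items()})
```

A positive skewness, as in the table, indicates a long right tail: most samples have low SOC and a few have much higher values.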
Auxiliary variable | Resolution | Reference |
Selected Terrain Attributes | | |
Aspect | 30 m | SRTM |
Slope Gradient | 30 m | SAGA GIS |
Elevation | 30 m | SAGA GIS |
Slope Length Factor | 30 m | SAGA GIS |
Valley Depth | 30 m | SAGA GIS |
Openness (PosOpen) | 30 m | SAGA GIS |
Openness (NegOpen) | 30 m | SAGA GIS |
Catchment Slope (CaSLOP) | 30 m | SAGA GIS |
Plane Curvature (Plan.Curv) | 30 m | SAGA GIS |
Topographic Wetness Index (TWI) | 30 m | SAGA GIS |
Channel networks base level (CHNL.BASE) | 30 m | SAGA GIS |
Multiresolution ridge top flatness index (MRRTF) | 30 m | Gallant and Dowling (2003) |
Multiresolution Valley Bottom Flatness Index (MrVBF) | 30 m | |
Selected RS data | | |
Blue band of Landsat-8 (B1) | 30 m | Wulder et al. (2016) |
Green band of Landsat-8 (B2) | 30 m | |
Red band of Landsat-8 (B3) | 30 m | |
Near-infrared band of Landsat-8 (B4) | 30 m | |
Shortwave IR-1 band of Landsat-8 (B5) | 30 m | |
Shortwave IR-2 band of Landsat-8 (B6) | 30 m | |
Normalized difference vegetation index (NDVI) | 30 m | Rouse et al. (1974) |
Enhanced vegetation index (EVI) | 30 m | |
Combined Spectral Response Index (COSRI) | 30 m | |
Transformed SAVI (TSAVI) | 30 m | |
Soil adjusted vegetation index (SAVI) | 30 m | |
Brightness Index | 30 m | Metternicht and Zinck (2003) |
Clay Index | 30 m | Boettinger et al. (2008) |
Carbonate index | | |
MODIS Red | 250 m | |
MODIS Near Infrared (MODIS Nir) | 250 m | |
MODIS Night Temperature (MODIS.Night.Temp) | 1000 m | |
MODIS Day Temperature (MODIS.Day.Temp) | 1000 m | |
Selected climatic data | | |
Annual precipitation (mm) | 1000 m | Fick and Hijmans (2017) |
Annual mean temperature (°C) | 1000 m | Fick and Hijmans (2017) |
Selected categorical data | | |
Land use | 125 m | Banaei et al. (2005) |
Soil map | 500 m | |
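Several of the remote sensing covariates listed above are band ratios. As a minimal sketch, the functions below use the commonly cited forms of NDVI, SAVI and EVI on surface reflectance values between 0 and 1. The exact formulations and coefficients used in the study are not given in this section, so the soil adjustment factor of 0.5 and the EVI coefficients are assumptions taken from the standard definitions, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil adjusted vegetation index; soil_factor=0.5 is the usual default."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def evi(nir, red, blue):
    """Enhanced vegetation index with the standard coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Hypothetical reflectances for a single vegetated pixel.
nir, red, blue = 0.5, 0.1, 0.05
print("NDVI:", round(ndvi(nir, red), 3))
print("SAVI:", round(savi(nir, red), 3))
print("EVI: ", round(evi(nir, red, blue), 3))
```

Applied to whole rasters, the same arithmetic runs per pixel on the red, near-infrared and blue band arrays.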
ML Algorithms | MAE | RMSE | R² | CCC |
SVM | 0.69 ± 0.07 | 0.87 ± 0.05 | 0.53 ± 0.05 | 0.76 ± 0.05 |
ANN | 0.67 ± 0.08 | 0.85 ± 0.07 | 0.55 ± 0.05 | 0.77 ± 0.06 |
Cubist | 0.66 ± 0.06 | 0.83 ± 0.04 | 0.57 ± 0.04 | 0.78 ± 0.04 |
RF | 0.65 ± 0.03 | 0.82 ± 0.03 | 0.58 ± 0.05 | 0.78 ± 0.03 |
XGB | 0.66 ± 0.04 | 0.83 ± 0.04 | 0.57 ± 0.03 | 0.78 ± 0.04 |
DNN | 0.59 ± 0.06 | 0.75 ± 0.06 | 0.65 ± 0.05 | 0.83 ± 0.06 |
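The four criteria in the table above can be computed from paired observed and predicted values. The sketch below is one way to do so; the `evaluate` helper and the sample data are hypothetical. R² is taken here as 1 − SSE/SST, which is an assumption because this section does not state which definition was used, and CCC follows Lin's concordance correlation coefficient.

```python
import math


def evaluate(observed, predicted):
    """Return MAE, RMSE, R2 (1 - SSE/SST) and Lin's CCC."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n
    sse = sum(e * e for e in errors)
    rmse = math.sqrt(sse / n)

    # Total sum of squares of the observations around their mean.
    sst = sum((o - mean_o) ** 2 for o in observed)
    r2 = 1.0 - sse / sst  # assumption: coefficient of determination form

    # Population variances and covariance for the concordance coefficient.
    var_o = sst / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    ccc = 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)

    return {"MAE": mae, "RMSE": rmse, "R2": r2, "CCC": ccc}


# Hypothetical SOC values (%), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print({name: round(value, 3)
       for name, value in evaluate(observed, predicted).items()})
```

Lower MAE and RMSE and higher R² and CCC indicate better agreement, which is the ordering that places DNN first in the table.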
ML Models | All | Inside CI, 5 to 95% (number of points) | Outside CI, <5% (number of points) | Outside CI, >95% (number of points) | Inside CI, 5 to 95% (%) | Outside CI, <5% (%) | Outside CI, >95% (%) |
SVM | 1879 | 1490 | 187 | 202 | 79.30 | 9.95 | 10.75 |
ANN | 1879 | 1524 | 165 | 190 | 81.11 | 8.78 | 10.11 |
Cubist | 1879 | 1580 | 155 | 144 | 84.09 | 8.25 | 7.66 |
RF | 1879 | 1559 | 150 | 170 | 82.97 | 7.98 | 9.05 |
XGB | 1879 | 1587 | 140 | 152 | 84.46 | 7.45 | 8.09 |
DNN | 1879 | 1650 | 110 | 119 | 87.81 | 5.85 | 6.33 |
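The counts and percentages above can be reproduced by checking each observation against its prediction limits. The sketch below assumes that "<5%" means the observed value lies below the lower limit and ">95%" means it lies above the upper limit; the `interval_coverage` helper and the sample data are hypothetical.

```python
def interval_coverage(observed, lower, upper):
    """Count observations inside, below and above their prediction limits."""
    n = len(observed)
    below = sum(1 for o, lo in zip(observed, lower) if o < lo)
    above = sum(1 for o, hi in zip(observed, upper) if o > hi)
    inside = n - below - above
    return {
        "inside": inside,
        "below": below,
        "above": above,
        "inside_pct": 100.0 * inside / n,
        "below_pct": 100.0 * below / n,
        "above_pct": 100.0 * above / n,
    }


# Hypothetical observations with their 5% and 95% prediction limits.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
lower = [1.5, 1.0, 2.0, 3.0, 4.0]
upper = [2.0, 3.0, 4.0, 5.0, 4.5]
print(interval_coverage(observed, lower, upper))

# Check against the DNN row of the table: 1650 of 1879 points inside.
print(round(100.0 * 1650 / 1879, 2))  # -> 87.81
```

For a nominal 90% interval, a share inside close to 90% with the remainder split evenly between the two tails indicates well-calibrated limits.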
Soil Orders | Mean a | CV (%) | Soil Suborders | Mean a | CV (%) |
Inceptisols | 2.45 B | 33.94 | Aquepts | 2.85 C | 20.11 |
Inceptisols | | | Xerepts | 2.06 B | 36.26 |
Alfisols | 2.55 B | 33.64 | Aqualfs | 1.94 C | 21.17 |
Alfisols | | | Udalfs | 3.17 B | 27.44 |
Entisols | 2.78 AB | 40.66 | Aquents | 2.51 BC | 14.91 |
Entisols | | | Fluvents | 1.91 C | 22.81 |
Entisols | | | Orthents | 3.93 A | 23.52 |
Mollisols | 3.20 A | 30.55 | Aquolls | 2.43 BC | 21.64 |
Mollisols | | | Rendolls | 4.03 A | 22.43 |
Mollisols | | | Udolls | 3.33 B | 24.08 |
Mollisols | | | Xerolls | 2.31 BC | 34.97 |
Ultisols | 4.04 A | 12.63 | Humults | 4.04 A | 15.63 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Emadi, M.; Taghizadeh-Mehrjardi, R.; Cherati, A.; Danesh, M.; Mosavi, A.; Scholten, T. Predicting and Mapping of Soil Organic Carbon Using Machine Learning Algorithms in Northern Iran. Remote Sens. 2020, 12, 2234.