Gully erosion spatial modelling: Role of machine learning algorithms in selection of the best controlling factors and modelling process |
| |
Affiliation: | 1. Department of Natural Resources and Environmental Engineering, College of Agriculture, Shiraz University, Shiraz, Iran; 2. Department of Geography, School of Earth Science, Bharathidasan University, Tiruchirappalli, 620 024, Tamil Nadu, India; 3. Department of Watershed and Arid Zone Management, Gorgan University of Agricultural Sciences and Natural Resources, Gorgan, Iran; 4. Sustainable Agriculture Sciences, Rothamsted Research, North Wyke, Okehampton, Devon, EX20 2SB, UK |
| |
Abstract: | This investigation assessed the efficacy of 10 widely used machine learning algorithms (MLAs), comprising the least absolute shrinkage and selection operator (LASSO), generalized linear model (GLM), stepwise generalized linear model (SGLM), elastic net (ENET), partial least squares (PLS), ridge regression, support vector machine (SVM), classification and regression trees (CART), bagged CART, and random forest (RF), for gully erosion susceptibility mapping (GESM) in Iran. The locations of 462 existing gully erosion sites were mapped through extensive field investigations, and the observations were randomly divided into 70% (323) for algorithm calibration and 30% (139) for validation. Twelve controlling factors for gully erosion, namely soil texture, mean annual rainfall, digital elevation model (DEM), drainage density, slope, lithology, topographic wetness index (TWI), distance from rivers, aspect, distance from roads, plan curvature, and profile curvature, were ranked in terms of their importance using each MLA. The MLAs were compared on the gully erosion training dataset using statistical measures such as root mean square error (RMSE), mean absolute error (MAE), and R-squared. In these comparisons, the RF algorithm exhibited the minimum RMSE and MAE and the maximum R-squared, and was therefore selected as the best model. The variable importance evaluation using the RF model revealed that distance from rivers had the greatest influence on the occurrence of gully erosion, whereas plan curvature had the least. According to the GESM generated using RF, most of the study area is predicted to have a low (53.72%) or moderate (29.65%) susceptibility to gully erosion, whereas only a small area is identified as having a high (12.56%) or very high (4.07%) susceptibility. 
The output generated by the RF model was validated using the receiver operating characteristic (ROC) curve approach, which returned an area under the curve (AUC) of 0.985, indicating excellent predictive ability. The GESM prepared using the RF algorithm can aid decision-makers in targeting remedial actions to minimize the damage caused by gully erosion. |
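The evaluation steps named in the abstract (a 70/30 split of the 462 sites, comparison by RMSE, MAE, and R-squared, and validation by AUC) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names, the random seed, and the choice of the coefficient of determination for R-squared are assumptions, and the software actually used in the study is not stated in the abstract.

```python
import math
import random


def split_sites(n_sites, train_fraction, seed=0):
    """Shuffle site indices and split them into calibration and validation sets.

    The seed is an arbitrary assumption; the abstract does not report one.
    """
    indices = list(range(n_sites))
    random.Random(seed).shuffle(indices)
    n_train = int(n_sites * train_fraction)  # 462 sites at 70% gives 323
    return indices[:n_train], indices[n_train:]


def rmse(y_true, y_pred):
    """Root mean square error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: some toolkits report the squared correlation instead,
    and the abstract does not say which definition was used.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen gully site (label 1)
    scores higher than a randomly chosen non-gully site (label 0),
    counting ties as one half.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Split sizes matching the abstract: 323 calibration and 139 validation sites.
calibration_idx, validation_idx = split_sites(462, 0.7)
```

Under this sketch, each candidate algorithm would be scored with `rmse`, `mae`, and `r_squared` on the calibration data, and the selected model's susceptibility scores would then be passed to `roc_auc` together with the validation labels.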
| |
Keywords: | Machine learning algorithm; Gully erosion; Random forest; Controlling factors; Variable importance |
This article is indexed in ScienceDirect and other databases. |
|