
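Comparison of Machine Learning Algorithms for Identifying Poverty-Stricken Regions: A Case Study of Shaanxi Province
Cite this article: Zhao Yu, Bai Yu, Yuan Xuefeng. Comparison of Machine Learning Algorithms for Identifying Poverty-Stricken Regions: A Case Study of Shaanxi Province[J]. Scientia Geographica Sinica, 2022, 42(8): 1421-1432. DOI: 10.13249/j.cnki.sgs.2022.08.010
Authors: Zhao Yu, Bai Yu, Yuan Xuefeng
Affiliations: School of Land Engineering, Chang'an University, Xi'an 710054, Shaanxi, China; School of Earth Science and Resources, Chang'an University, Xi'an 710054, Shaanxi, China
Funding: Fundamental Research Funds for the Central Universities (300102270207)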
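Abstract: Poverty identification based mainly on traditional socioeconomic indicators relies on detailed census and sample-survey data. Collecting and processing survey data of varying quality and quantity to study regional poverty consumes substantial labor, resources, and time, which makes it difficult to monitor poverty status quickly and dynamically. Nighttime light data, which have high temporal resolution and are objective and easy to obtain, can partly offset the shortcomings of statistical data and reflect surface socioeconomic phenomena in real time. Machine learning algorithms can learn rules and patterns from these data and mine latent information to identify poverty-stricken regions. Using NPP-VIIRS nighttime light data for Shaanxi Province, this study constructs multidimensional statistical variables and applies six supervised classification algorithms (logistic regression, support vector machine, K-nearest neighbors, random forest, decision tree, and gradient boosting decision tree) to identify poverty-stricken regions. The results show that multidimensional features extracted from nighttime light data are better suited to identifying poverty-stricken regions; that all six algorithms identify these regions accurately; that the classification results are spatially similar and show a degree of regional character; and that classification accuracy ranges from 76.82% to 83.20%. A further comparison of the algorithms based on their confusion matrices indicates that the random forest algorithm performs best overall in terms of error bias and classification accuracy.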
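The step of constructing multidimensional statistical variables from nighttime light data can be illustrated with a minimal sketch. The abstract does not list which statistics the authors computed, so the feature set here (sum, mean, standard deviation, minimum, maximum, median, and share of lit pixels) is an assumption, as are the function name `light_features` and the sample radiance values, which are not real NPP-VIIRS data.

```python
from statistics import mean, median, pstdev


def light_features(pixels):
    """Summarise the nighttime-light pixel values of one region as a feature dict.

    The choice of statistics is illustrative, not the paper's documented feature set.
    """
    values = [float(v) for v in pixels]
    if not values:
        raise ValueError("region has no pixels")
    return {
        "sum": sum(values),                # total brightness of the region
        "mean": mean(values),              # average brightness per pixel
        "std": pstdev(values),             # population standard deviation
        "min": min(values),
        "max": max(values),
        "median": median(values),
        # share of pixels with any detected light
        "lit_fraction": sum(v > 0 for v in values) / len(values),
    }


# Hypothetical radiance values for the pixels inside one county.
county_pixels = [0.0, 0.0, 1.0, 3.0]
features = light_features(county_pixels)
print(features)
```

Each region then contributes one row of such features to the table on which the classifiers are trained, in place of a single variable such as mean brightness.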

Keywords: poverty identification; machine learning; algorithm comparison; feature engineering; nighttime light data
Received: 2021-03-23
Revised: 2021-11-02

Comparison of Machine Learning Algorithms for Identifying Poverty-stricken Regions: A Case of Shaanxi
Zhao Yu,Bai Yu,Yuan Xuefeng. Comparison of Machine Learning Algorithms for Identifying Poverty-stricken Regions: A Case of Shaanxi[J]. Scientia Geographica Sinica, 2022, 42(8): 1421-1432. DOI: 10.13249/j.cnki.sgs.2022.08.010
Authors:Zhao Yu  Bai Yu  Yuan Xuefeng
Affiliation:1. School of Land Engineering, Chang’an University, Xi’an 710054, Shaanxi, China
2. School of Earth Science and Resources, Chang’an University, Xi’an 710054, Shaanxi, China
Abstract:Improving the platforms for monitoring relapse into poverty and making full use of advanced technical means to improve monitoring accuracy are crucial for consolidating existing achievements and building a solid foundation for rural revitalization. Poverty identification led by traditional socio-economic indicators depends on detailed census data. Collecting and processing census data of varying quality and quantity to study regional poverty is costly, which makes it difficult to monitor poverty quickly and dynamically. Nighttime light data, which have high temporal resolution and are easy to obtain, can partly compensate for the shortcomings of statistical data and reflect surface socio-economic phenomena in real time. Machine learning can learn rules and patterns from these data and mine latent information to identify poor areas. Taking Shaanxi Province as an example, six supervised classification algorithms, namely Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Random Forest Classifier (RFC), Decision Tree (DT), and Gradient Boosting Decision Tree (GBDT), are used to identify poor areas from the rich features of nighttime light data. The results show that: 1) Multidimensional variables yield about 5% higher simulation accuracy than single variables, and accuracy rises by 1%-5% after downscaling, so these variables are better suited to identifying poverty-stricken regions. In addition, the variables show similar feature importance, which indicates that the algorithms use them reasonably. 2) Given certain prior knowledge, all six machine learning algorithms classify poverty-stricken regions with an accuracy of about 80%, and their results show strong spatial consistency and regional patterns. 3) Classification accuracy, F1-score, and the Kappa coefficient are used to compare the selected algorithms: RFC ranks first; KNN, GBDT, and LR perform moderately; and SVM and DT perform worst. Poverty-stricken regions are key areas for rural revitalization, so establishing a dynamic system that monitors their relative poverty level in real time is crucial for consolidating existing poverty alleviation achievements, achieving long-term and stable poverty eradication, and building a solid foundation for rural revitalization. This paper aims to find an auxiliary means of identifying poverty-stricken regions efficiently and in real time, and thereby to provide a reference for using machine learning algorithms in the dynamic monitoring of such regions.
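The three comparison measures named in finding 3) can all be derived from a confusion matrix. The sketch below uses the standard definitions of accuracy, F1-score, and Cohen's kappa for a two-class case (poor versus non-poor). The binary framing, the function name `binary_metrics`, and the counts are assumptions for illustration; they are not results from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1-score and Cohen's kappa from a 2x2 confusion matrix.

    tp/fn: poor regions classified correctly/incorrectly.
    tn/fp: non-poor regions classified correctly/incorrectly.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total

    # F1 is the harmonic mean of precision and recall; define it as 0
    # when there are no true positives to avoid dividing by zero.
    if tp == 0:
        f1 = 0.0
    else:
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)

    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    # Kappa is undefined when chance agreement is already perfect.
    kappa = 0.0 if expected == 1 else (accuracy - expected) / (1 - expected)

    return {"accuracy": accuracy, "f1": f1, "kappa": kappa}


# Hypothetical counts for one classifier on 100 test regions.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

Kappa discounts the agreement a classifier would reach by chance, so with balanced classes it comes out lower than raw accuracy. This is one reason a ranking can differ from one based on accuracy alone.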
Keywords:poverty identification  machine learning  algorithm comparison  feature engineering  nighttime light data
This article is indexed in Wanfang Data and other databases.