
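Urban Area Extraction based on Independent Component Analysis and Random Forest Algorithm
Cite this article: PU Dongchuan, WANG Guizhou, ZHANG Zhaoming, NIU Xuefeng, HE Guojin, LONG Tengfei, YIN Ranyu, JIANG Wei, SUN Jiayue. Urban Area Extraction based on Independent Component Analysis and Random Forest Algorithm[J]. Geo-information Science, 2020, 22(8): 1597-1606.
Authors: PU Dongchuan  WANG Guizhou  ZHANG Zhaoming  NIU Xuefeng  HE Guojin  LONG Tengfei  YIN Ranyu  JIANG Wei  SUN Jiayue
Affiliations: 1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094; 2. College of Geo-exploration Science and Technology, Jilin University, Changchun 130026; 3. Key Laboratory of Earth Observation of Hainan Province, Sanya 572029; 4. Sanya Institute of Remote Sensing, Sanya 572029
Funding: Key Program of the National Natural Science Foundation of China (61731022); Strategic Priority Research Program (Category A) of the Chinese Academy of Sciences (XDA19090300); National Key Research and Development Program of China (2016YFA0600302); National Key Research and Development Program of China (2016YFB0501502)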
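Abstract: Urban land information is one of the focal concerns of the United Nations 2030 Agenda for Sustainable Development. Cities are expanding rapidly worldwide, and obtaining urban land information quickly and accurately plays an important role in government decision-making. Urban land cover is highly complex and includes many surface cover types, such as man-made buildings, trees, grassland, and water bodies. Obtaining urban land information through traditional manual surveying is time-consuming and laborious, and the results are difficult to update promptly. Remote sensing satellite data such as Landsat provide a rich data source for urban land extraction. Urban land information extracted from satellite remote sensing data can provide basic scientific data for decisions on future urban construction and management. Supervised classification applied to satellite remote sensing data can extract urban land information quickly; however, the choice of feature variables is particularly important for high-accuracy extraction. To study how different feature combinations affect urban land extraction, Beijing was taken as the study area and a Landsat 8 OLI image acquired on July 10, 2017 as the data source. Data preprocessing, texture extraction, independent component analysis, and principal component analysis yielded 29 features in 4 dimensions, and 7 feature combination schemes were selected for urban land extraction. Because the random forest algorithm offers stable performance, high classification accuracy, and convenient feature importance evaluation, it was chosen as the supervised classifier, and accuracy was assessed to determine the optimal feature combination. The results show that: combining spectral features with image features obtained from independent component analysis gives an overall accuracy of 93.1% and a Kappa coefficient of 0.86, better than the results obtained with the other features; and the normalized variable importance output by the trained random forest is similar to the standard deviation of the feature means, so both the random forest variable importance estimate and line charts of feature means can be used to evaluate variable importance.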
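The accuracy assessment reported above rests on two standard measures, overall accuracy and the Kappa coefficient, both derived from a confusion matrix. The following is a minimal sketch of that calculation, not the authors' code. The function name `accuracy_and_kappa` and the counts in the example matrix are hypothetical and are not taken from the paper.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: sum over classes of (row total * column total) / total^2.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class example (urban vs. non-urban); not the paper's data.
example_confusion = [
    [45, 5],   # reference urban: 45 predicted urban, 5 predicted non-urban
    [5, 45],   # reference non-urban: 5 predicted urban, 45 predicted non-urban
]
overall_accuracy, kappa = accuracy_and_kappa(example_confusion)
print(f"overall accuracy = {overall_accuracy:.3f}, kappa = {kappa:.3f}")
```

Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy for the same matrix.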

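Keywords: random forest  independent component analysis  principal component analysis  gray level co-occurrence matrix  satellite remote sensing  urban land  Landsat 8  feature importance
Received: 2019-07-19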

Urban Area Extraction based on Independent Component Analysis and Random Forest Algorithm
PU Dongchuan,WANG Guizhou,ZHANG Zhaoming,NIU Xuefeng,HE Guojin,LONG Tengfei,YIN Ranyu,JIANG Wei,SUN Jiayue.Urban Area Extraction based on Independent Component Analysis and Random Forest Algorithm[J].Geo-information Science,2020,22(8):1597-1606.
Authors:PU Dongchuan  WANG Guizhou  ZHANG Zhaoming  NIU Xuefeng  HE Guojin  LONG Tengfei  YIN Ranyu  JIANG Wei  SUN Jiayue
Institution: 1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; 2. College of Geo-exploration Science and Technology, Jilin University, Changchun 130026, China; 3. Key Laboratory of Earth Observation Hainan Province, Sanya 572029, China; 4. Sanya Institute of Remote Sensing, Sanya 572029, China
Abstract: Urban area information is of great significance for human development and is a focus of the United Nations (UN) 2030 Agenda for Sustainable Development. Urban areas are expanding rapidly in many parts of the world, and accurate, timely urban area information is very important for decision makers. However, land cover in urban areas is highly complex, including artificial buildings, trees, grasslands, and water bodies. Extracting urban land cover information by traditional manual survey is time-consuming, and the results are difficult to keep up to date. Freely accessible remote sensing satellite data such as Landsat provide a rich data source for urban area extraction. Urban area information extracted from spaceborne remote sensing images can provide basic scientific data for decision-making and for city construction and management. Supervised classification of satellite remote sensing data makes it possible to extract urban areas quickly. However, choosing appropriate feature variables is very important for accurate urban area extraction; in particular, linear correlation between features has a significant impact on extraction accuracy. Applying an independent component analysis (ICA) transformation to satellite remote sensing image data yields linearly independent feature variables, which can effectively improve the accuracy of urban area extraction. Taking Beijing as the study area and Landsat 8 Operational Land Imager (OLI) imagery (path/row: 123/32) acquired on July 10, 2017 as the experimental data, preprocessing, texture extraction, independent component analysis, and principal component analysis were performed; 29 features in 4 dimensions were obtained, and 7 feature variable combinations were selected. The Random Forest (RF) algorithm was then chosen for urban area extraction owing to its stable performance, high classification accuracy, and feature importance evaluation capability. Based on the random forest algorithm, feature importance evaluation, urban area extraction, and accuracy assessment were carried out to determine the optimal feature combination for urban area extraction. It was found that: (1) the overall accuracy of urban area extraction with spectral and ICA-transformed features is 93.1% and the Kappa coefficient is 0.86, which is superior to the results with other features; (2) the normalized importance of each feature obtained by training the random forest is similar to the standard deviation of the mean values of the features, indicating that the two are closely related and that both can be used to estimate the importance of the variables.
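Finding (2) compares random forest importance with the standard deviation of the mean values of the features. The abstract does not define that statistic precisely. The sketch below assumes one plausible reading: for each feature, take the mean within each class, then take the standard deviation of those class means. The function names, class labels, and sample values are all hypothetical, and the features are assumed to be on a comparable scale.

```python
from statistics import mean, pstdev


def class_mean_spread(samples, labels):
    """For each feature, return the standard deviation of its per-class means.

    samples: list of feature vectors (lists of floats, equal length).
    labels:  list of class labels, one per sample.
    A larger spread means the class means differ more for that feature.
    """
    classes = sorted(set(labels))
    n_features = len(samples[0])
    spreads = []
    for f in range(n_features):
        class_means = [
            mean(row[f] for row, lab in zip(samples, labels) if lab == c)
            for c in classes
        ]
        # Population standard deviation across the class means.
        spreads.append(pstdev(class_means))
    return spreads


def normalize(values):
    """Scale non-negative scores so that they sum to 1 (all zeros stay zeros)."""
    total = sum(values)
    return [v / total for v in values] if total else [0.0 for _ in values]


# Hypothetical, already-scaled samples with two features; not the paper's data.
samples = [[0.1, 5.0], [0.3, 5.2], [0.9, 5.1], [0.7, 4.9]]
labels = ["urban", "urban", "non-urban", "non-urban"]

spreads = class_mean_spread(samples, labels)
scores = normalize(spreads)
print("spread per feature:", spreads)
print("normalized scores:", scores)
```

In this toy example the first feature separates the two class means far more than the second, so it receives the larger normalized score. Such a score could then be compared with the normalized importance reported by a trained random forest.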
Keywords:random forest  independent component analysis  principal component analysis  gray level co-occurrence matrix  satellite remote sensing  urban area  Landsat 8  feature importance  
This article is indexed in CNKI and other databases.