
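Performance analysis of convolutional neural network and visual attention semantic segmentation models in high-resolution remote sensing image classification
Cite this article: Zhu Yefei, Qin Xiaolin, Zhan Yating, et al. Performance analysis of convolutional neural network and visual attention semantic segmentation models in high-resolution remote sensing image classification [J]. 地质学刊, 2023, 47(3): 271-281.
Authors: Zhu Yefei  Qin Xiaolin  Zhan Yating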
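Funding: Jiangsu Province Marine Science and Technology Innovation Special Fund projects "Remote sensing dynamic monitoring of sea area and island use in Jiangsu Province" (HY2019-2) and "Research and application demonstration of remote sensing monitoring methods for Jiangsu coastal zone resources using coordinated multi-source remote sensing data" (JSZRHYKJ202207); Jiangsu Province Natural Resources Science and Technology Special Fund project "Research on deep-learning-based classification of natural resources remote sensing imagery" (KJXM2019034)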
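Abstract: With the rapid development of deep learning semantic segmentation, many high-resolution remote sensing image classification methods based on computer vision semantic segmentation models have emerged. To study systematically and quantitatively how classic and advanced visual semantic segmentation models perform in remote sensing image classification, this work first summarizes progress in deep learning semantic segmentation and then selects 9 semantic segmentation algorithms based on convolutional neural networks (CNN) and visual attention, evaluating them on remote sensing datasets at two scales, meter-level and centimeter-level. The models are built on a general-purpose computer vision semantic segmentation framework; training uses three-band red, green and blue remote sensing images and transfer learning from ImageNet pre-trained weights. The results show that: general-purpose semantic segmentation models trained with conventional settings achieve good remote sensing image classification results, with the intersection over union (IoU) of some ground-object classes exceeding 90%; remote sensing image classification models based on visual attention are generally more accurate than CNN-based models, and MaskFormer extracts discrete ground-object information more effectively; the highest accuracy for each class is not always obtained by the overall best model, and for some classes it is obtained by a sub-optimal model; and similar ground objects reach higher accuracy on the higher-resolution remote sensing dataset.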
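The training setup described above (three RGB bands plus ImageNet pre-trained weights) implies an input preparation step. The following is a minimal sketch of that step, not the authors' pipeline: the function name `prepare_rgb`, the band order, the 8-bit value range, and the use of the commonly published ImageNet channel mean and standard deviation are all assumptions, since the abstract does not state how inputs were normalized.

```python
import numpy as np

# Commonly used ImageNet channel statistics (RGB order, inputs scaled to [0, 1]).
# Assumption: the abstract does not say which normalization was applied.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def prepare_rgb(tile, rgb_band_indices=(0, 1, 2), max_value=255.0):
    """Select three bands from a (bands, H, W) tile and normalize them.

    rgb_band_indices and max_value are hypothetical parameters; the
    correct band order and value range depend on the sensor and dataset.
    Returns a float32 array of shape (3, H, W).
    """
    tile = np.asarray(tile, dtype=np.float32)
    # Keep only the red, green and blue bands and scale to [0, 1].
    rgb = tile[list(rgb_band_indices)] / max_value
    # Broadcast the per-channel statistics over the spatial dimensions.
    return (rgb - IMAGENET_MEAN[:, None, None]) / IMAGENET_STD[:, None, None]
```

Matching the input statistics to those seen during pre-training is a common choice when fine-tuning from ImageNet weights; extra bands such as near-infrared are simply dropped in this sketch.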

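Keywords: remote sensing image classification  deep learning  semantic segmentation  visual attention  convolutional neural network  performance analysis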

Performance analysis of convolutional neural network and visual attention semantic segmentation model in high resolution remote sensing image classification
Zhu Yefei, Qin Xiaolin, Zhan Yating, et al. Performance analysis of convolutional neural network and visual attention semantic segmentation model in high resolution remote sensing image classification [J]. Jiangsu Geology, 2023, 47(3): 271-281.
Authors:Zhu Yefei  Qin Xiaolin  Zhan Yating  
Abstract: With the rapid development of deep learning semantic segmentation, a large number of high-resolution remote sensing classification methods based on computer vision semantic segmentation models have emerged. To analyze quantitatively and systematically how classic and advanced visual semantic segmentation models perform in remote sensing classification, this paper first summarizes recent progress in deep learning semantic segmentation and then selects 9 semantic segmentation algorithms based on convolutional neural networks (CNN) and visual attention, evaluating them on remote sensing datasets at the meter and centimeter scales. The models are built on a general-purpose computer vision semantic segmentation framework. Training used only red, green and blue band remote sensing images, with transfer learning from ImageNet pre-trained weights. The results show, first, that general-purpose semantic segmentation models trained with conventional settings achieve good remote sensing classification results, and the intersection over union (IoU) of some ground features exceeds 90%. Second, remote sensing classification models based on visual attention are generally more accurate than CNN-based models, and MaskFormer extracts instance-level feature information more effectively. Third, the highest per-class accuracy is not always obtained by the overall best model; for some classes it is obtained by a sub-optimal model. Fourth, similar ground features achieve higher accuracy on the higher-resolution dataset.
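The per-class IoU reported above is conventionally computed from a confusion matrix as IoU = TP / (TP + FP + FN). The sketch below illustrates that standard definition with NumPy; it is not the authors' evaluation code, and the function names and the toy labels in the usage note are illustrative assumptions.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels by (true class, predicted class).

    Rows index the true class, columns the predicted class.
    Labels are assumed to be integers in [0, num_classes).
    """
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Encode each (true, pred) pair as a single index, then histogram.
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def per_class_iou(cm):
    """Return IoU for each class; NaN where a class never occurs."""
    tp = np.diag(cm).astype(np.float64)
    fp = cm.sum(axis=0) - tp  # predicted as the class but belonging to another
    fn = cm.sum(axis=1) - tp  # belonging to the class but predicted as another
    denom = tp + fp + fn
    return np.divide(tp, denom, out=np.full_like(tp, np.nan), where=denom > 0)
```

For example, with true labels `[0, 0, 1, 1]` and predictions `[0, 1, 1, 1]`, class 0 has one true positive and one false negative (IoU 0.5) and class 1 has two true positives and one false positive (IoU 2/3). Taking `np.nanmean` of the result gives a mean IoU, which is one way an "overall best" model can be ranked while individual classes peak in a different model.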
Keywords:remote sensing image classification  deep learning  semantic segmentation  visual attention  convolutional neural network  performance analysis