Retrieving Suspended Matter Concentration in Rivers based on Hyperparameter Optimized CatBoost Algorithm
CHEN Diandian,CHEN Yunzhi,FENG Xianfeng,WU Shuang.Retrieving Suspended Matter Concentration in Rivers based on Hyperparameter Optimized CatBoost Algorithm[J].Geo-information Science,2022,24(4):780-791.
Authors:CHEN Diandian (陈点点)  CHEN Yunzhi (陈芸芝)  FENG Xianfeng (冯险峰)  WU Shuang (武爽)
Affiliations:1. National & Local Joint Engineering Research Center of Satellite Geospatial Information Technology, Key Laboratory of Spatial Data Mining & Information Sharing of Ministry of Education, The Academy of Digital China (Fujian), Fuzhou University, Fuzhou 350108; 2. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101; 3. University of Chinese Academy of Sciences, Beijing 100049
Funding:Strategic Priority Research Program of the Chinese Academy of Sciences
Received:2021-08-03

Abstract:Total suspended matter (TSM) concentration is an important parameter in the assessment of aquatic ecological environments, and timely information on its dynamics in rivers is necessary for inland water quality monitoring and water environment management. Based on field-measured spectra and suspended matter concentration data, this study selects band-combination reflectances that are highly correlated with suspended matter concentration as independent variables and builds remote sensing inversion models with the CatBoost, random forest, and multiple linear regression algorithms. Grid search with cross-validation is used to tune the hyperparameters of the two machine learning models, CatBoost and random forest, and to determine their optimal parameter configurations. The inversion accuracy of the three models is then compared to identify the best one. With the best model and multi-temporal Sentinel-2 MSI images from 2019 to 2020, suspended matter concentrations in the lower reaches of the Minjiang River are retrieved and their spatial and temporal variation is analysed.
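The tuning step named in the abstract, grid search with cross-validation, can be illustrated with a minimal sketch. The study tunes CatBoost and random forest models; to stay runnable with only the Python standard library, this sketch swaps in a tiny distance-weighted nearest-neighbour regressor. The hyperparameter names (`k`, `power`), the grid values, the fold count, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
import itertools
import math
import random


def knn_predict(train, x, k, power):
    """Stand-in model: distance-weighted mean of the k nearest training targets."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    weights = [1.0 / (abs(px - x) + 1e-9) ** power for px, _ in nearest]
    return sum(w * py for w, (_, py) in zip(weights, nearest)) / sum(weights)


def cv_rmse(data, n_folds, k, power):
    """Mean RMSE over n_folds held-out folds for one hyperparameter setting."""
    scores = []
    for fold in range(n_folds):
        test = data[fold::n_folds]  # every n_folds-th sample is held out
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        sq_err = [(knn_predict(train, x, k, power) - y) ** 2 for x, y in test]
        scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(scores) / n_folds


def grid_search(data, grid, n_folds=5):
    """Score every combination in the grid; return (best_params, best_rmse)."""
    names = sorted(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = cv_rmse(data, n_folds, **params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic (feature, concentration) pairs; NOT data from the study.
rng = random.Random(0)
data = [(x, 40.0 * x + rng.gauss(0.0, 2.0))
        for x in (rng.uniform(0.5, 2.0) for _ in range(60))]
grid = {"k": [1, 3, 5, 9], "power": [0, 1, 2]}  # hypothetical grid
best_params, best_rmse = grid_search(data, grid)
print(best_params, round(best_rmse, 2))
```

The same loop structure applies to any regressor: each grid point is scored only on folds it was not fitted on, so the selected configuration is not simply the one that memorises the training samples best.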
The results indicate that: ① b4/b3, (b6-b3)/(b6+b3), (b4+b8)/b3, and (1/b3-1/b4)×b5 are the best band-combination reflectances for retrieving TSM concentration in the lower Minjiang River from MSI data; ② of the three models, the suspended matter inversion model built on the hyperparameter-optimized CatBoost algorithm is the most accurate, with a coefficient of determination (R²) of 0.95, a root mean square error (RMSE) of 15.32 mg/L, and a mean absolute percentage error (MAPE) of 19.68%; ③ from 2019 to 2020, suspended matter concentration in the lower Minjiang River was low in the west and high in the east, rising from Baisha to the Langqi estuary; ④ suspended matter concentration is highest in summer, followed by winter and autumn, and lowest in spring. This study provides an effective technical approach and a theoretical reference for monitoring suspended matter concentration in the lower Minjiang River and analysing its spatio-temporal variation.
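The four band combinations in result ① and the three accuracy measures in result ② can be written out as a short sketch. It assumes that b3, b4, b5, b6, and b8 denote reflectances in the correspondingly numbered Sentinel-2 MSI bands, and that R² is computed as one minus the ratio of residual to total sum of squares; the abstract does not spell out either point. The function names and the sample values are hypothetical.

```python
import math


def band_features(b3, b4, b5, b6, b8):
    """Return the four band-combination reflectances listed in result ①."""
    return [
        b4 / b3,
        (b6 - b3) / (b6 + b3),
        (b4 + b8) / b3,
        (1.0 / b3 - 1.0 / b4) * b5,
    ]


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations (e.g. mg/L)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error in percent; observations must be non-zero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


# Made-up reflectances and concentrations, for illustration only.
features = band_features(b3=0.08, b4=0.10, b5=0.09, b6=0.06, b8=0.04)
observed = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
print(features)
print(r_squared(observed, predicted), rmse(observed, predicted), mape(observed, predicted))
```

RMSE is reported in concentration units while MAPE is relative, which is why the abstract quotes both: a fixed error in mg/L weighs far more heavily on MAPE at low concentrations than at high ones.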
Keywords:Sentinel-2 MSI  total suspended matter  CatBoost  Random Forest  multiple linear regression  water color remote sensing  Minjiang  temporal and spatial distribution characteristics  