
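Importance Analysis of Warm-Season Rainstorm Forecast Factors in Jiangxi Based on SHAP-Value Machine Learning
Cite this article: Xia Houjie, Xiao An. Importance analysis of warm-season rainstorm forecast factors in Jiangxi based on SHAP-value machine learning[J]. Meteorology and Disaster Reduction Research, 2024, 47(1): 12-23
Authors: Xia Houjie, Xiao An
Affiliation: Jiangxi Meteorological Observatory
Funding: Key Research Project of the Jiangxi Meteorological Bureau (No. JX2020Z04); Key R&D Project of the Jiangxi Provincial Department of Science and Technology (No. 20203BBGL73223); Innovation and Development Special Project of the China Meteorological Administration (No. CXFZ2021Z012).
Abstract: The lack of interpretability of machine learning (ML) models poses a challenge to their use in operational meteorology. Model interpretation and visualization are an effective way to address this problem. This paper applies SHAP values to the interpretation of ML weather forecasting models and examines how the forecast factors of a warm-season rainstorm model for Jiangxi Province influence its forecasts. Physical quantities from the ECMWF (European Centre for Medium-Range Weather Forecasts) high-resolution numerical model and precipitation observations from national stations for April to September of 2016 to 2020 and of 2021 to 2022 were used for XGBoost modeling and for model interpretation, respectively. The results show that the four factors ranked highest in global importance are, in order, total precipitation (42.70%), 850 hPa specific humidity (11.17%), 925 hPa relative humidity (10.44%), and 500 hPa relative humidity (9.16%). Case analyses show that in hit cases the high-importance physical factors have large SHAP values in the rainstorm area, whereas in miss (false alarm) cases the SHAP values of the high-importance physical factors are too small (too large) in the missed (false alarm) area. SHAP values provide quantitative, physically meaningful explanations of the ML model at both the global and the local level. These explanations agree reasonably well with synoptic principles and operational experience, which supports the further use of ML in operational meteorology.

Keywords: SHAP values; machine learning; rainstorm; factor importance; interpretability
Received: 2023-09-04
Revised: 2023-11-23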

Importance analysis on warm season rainstorm forecast factors in Jiangxi Province based on machine learning model of Shapley values
Xia Houjie, Xiao An. Importance analysis on warm season rainstorm forecast factors in Jiangxi Province based on machine learning model of Shapley values[J]. Meteorology and Disaster Reduction Research, 2024, 47(1): 12-23
Authors:Xia Houjie  Xiao An
Affiliation:Jiangxi Meteorological Observatory
Abstract: The inability to understand how machine learning (ML) models make their predictions poses a major challenge to their application in day-to-day weather forecasting operations. Model interpretation and visualization (MIV) is key to solving this problem. In this paper, Shapley values (SV) were applied to the MIV of an ML model for warm season rainstorm forecasting, and the impact of the forecast factors on the model's forecasts was examined. Output products of the ECMWF (European Centre for Medium-Range Weather Forecasts) high-resolution numerical model (EC model) and precipitation records from national weather stations in Jiangxi Province for April to September of 2016 to 2022 were selected for model training and MIV. The results showed that the four factors of highest global importance were total precipitation (42.70%), specific humidity at 850 hPa (11.17%), relative humidity at 925 hPa (10.44%), and relative humidity at 500 hPa (9.16%). Applying SV to individual weather cases showed that, in hit cases, the SV of the high-importance physical factors were larger in the rainstorm area, whereas in miss (false alarm) cases the SV of these factors were smaller (larger) in the missed (false alarm) area. This indicates that SV can explain the ML model quantitatively, from a global understanding down to local explanations of each prediction. The explanations given by SV were consistent with physical principles and weather forecasting experience, which benefits the development of ML in the weather forecasting sciences.
Keywords: Shapley values; machine learning; rainstorm; factor importance; interpretability
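
The abstract reports a global importance percentage for each forecast factor and a per-case (local) attribution. The sketch below illustrates one common way such numbers can be obtained: compute exact Shapley values against a fixed baseline, then normalize the mean absolute value per factor so the shares sum to 100%. Everything in it is an assumption made for illustration: the toy scoring function stands in for the paper's XGBoost model, the factor names and sample values are invented, the all-zero baseline replaces the background-data expectation that SHAP tooling normally uses, and the abstract does not state that the authors computed their percentages this way.

```python
from itertools import combinations
from math import factorial

# Hypothetical factor names that loosely echo the abstract; not the authors' feature set.
FACTORS = ["total_precip", "q850", "rh925", "rh500"]

# Invented, already-normalized factor values for three grid points (illustration only).
SAMPLES = [
    [0.9, 0.8, 0.7, 0.6],
    [0.2, 0.5, 0.9, 0.4],
    [0.6, 0.3, 0.5, 0.8],
]
BASELINE = [0.0, 0.0, 0.0, 0.0]  # assumed reference input


def toy_model(x):
    """Stand-in scoring function (NOT the paper's XGBoost model)."""
    tp, q850, rh925, rh500 = x
    return 0.6 * tp + 0.2 * q850 + 0.1 * rh925 + 0.1 * rh500 + 0.3 * tp * q850


def coalition_value(model, x, baseline, subset):
    """Model output when only the factors in `subset` take their value from x."""
    mixed = [x[i] if i in subset else baseline[i] for i in range(len(x))]
    return model(mixed)


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every coalition (fine for a few factors)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                gain = (coalition_value(model, x, baseline, s | {i})
                        - coalition_value(model, x, baseline, s))
                phi[i] += weight * gain
    return phi


def global_importance(model, samples, baseline):
    """Mean absolute Shapley value per factor, normalized to percentages."""
    n = len(baseline)
    mean_abs = [0.0] * n
    for x in samples:
        for i, p in enumerate(shapley_values(model, x, baseline)):
            mean_abs[i] += abs(p) / len(samples)
    total = sum(mean_abs)
    return [100.0 * m / total for m in mean_abs]


if __name__ == "__main__":
    # Local explanation of one sample, then the global ranking over all samples.
    print(dict(zip(FACTORS, shapley_values(toy_model, SAMPLES[0], BASELINE))))
    print(dict(zip(FACTORS, global_importance(toy_model, SAMPLES, BASELINE))))
```

For a single sample, the Shapley values sum to the difference between the model output at that sample and at the baseline, which is what makes a local, per-case reading possible. Exhaustive enumeration scales exponentially with the number of factors, so a real tree model with many factors would need a dedicated tree-based SHAP algorithm instead.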