Optimization of the BCCAGCM Model on the Sunway TaihuLight System
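Cite this article: Wei Min, Wang Bin, He Xiang, Sun Jun, Jiang Xiaocheng, Xiao Sa, Zhang Li, Xu Jinxiu. Optimization of the BCCAGCM model on the Sunway TaihuLight system[J]. 应用气象学报 (Quarterly Journal of Applied Meteorology), 2019, 30(4): 502-512.
Authors: Wei Min  Wang Bin  He Xiang  Sun Jun  Jiang Xiaocheng  Xiao Sa  Zhang Li  Xu Jinxiu
Affiliation: 1. National Meteorological Information Center, Beijing 100081
Funding: Special Fund for Meteorological Research in the Public Interest (GYHY201306062); National Key Research and Development Program of China (2016YFA0602102)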
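Abstract: Porting and optimizing meteorological numerical models on the Sunway TaihuLight system is important for studying how well such models adapt to new computing architectures. This paper takes the BCCAGCM model as its subject and ports it to Sunway TaihuLight, a fully domestically built heterogeneous many-core computing system. After performance analysis, the computational structure of the model's dynamical framework and physical processes is adjusted, the core computation kernels are accelerated on the many-core hardware with OpenACC, and the algorithms in a large amount of code are refactored. The results show that the computing efficiency of each kernel is generally about 3 times that of the unoptimized version, and up to about 14 times. Integrating the kernels yields a heterogeneous many-core integrated version that runs correctly and stably, with reasonable computational error. Across different parallel scales, the speedup of the whole model obtained by using the CPEs is fairly stable at about 1.9 times. At a parallel scale of 26000 cores, the parallel efficiency is about 70% for the dynamics experiment and about 57% for the other experiments.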
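The speedup and parallel efficiency figures quoted above can be related with a small calculation. The sketch below assumes the common strong-scaling definition (speedup relative to a smaller baseline run, divided by the ratio of core counts); the abstract does not state which definition or baseline the authors used, and the core counts and timings in the usage note are hypothetical.

```python
def speedup(t_reference, t_new):
    """Ratio of reference run time to new run time (larger is better)."""
    return t_reference / t_new


def parallel_efficiency(t_base, n_base, t_scaled, n_scaled):
    """Strong-scaling efficiency of a run on n_scaled cores
    relative to a baseline run on n_base cores.

    Assumed definition: (t_base / t_scaled) / (n_scaled / n_base).
    A value of 1.0 would mean perfect scaling.
    """
    return speedup(t_base, t_scaled) / (n_scaled / n_base)


if __name__ == "__main__":
    # Hypothetical timings, chosen only to illustrate the formula.
    t_1000 = 100.0          # seconds per simulated day on 1000 cores
    t_26000 = 100.0 / 18.2  # seconds per simulated day on 26000 cores
    eff = parallel_efficiency(t_1000, 1000, t_26000, 26000)
    print(f"parallel efficiency: {eff:.2f}")
```

With these made-up timings, an 18.2-fold reduction in run time on 26 times as many cores corresponds to an efficiency of about 0.70.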

Keywords: BCCAGCM    Sunway TaihuLight    heterogeneous computing    many-core
Received: 2019/3/17
Revised: 2019/5/8

Optimizing BCCAGCM on Sunway TaihuLight
Wei Min, Wang Bin, He Xiang, Sun Jun, Jiang Xiaocheng, Xiao Sa, Zhang Li and Xu Jinxiu. Optimizing BCCAGCM on Sunway TaihuLight[J]. Quarterly Journal of Applied Meteorology, 2019, 30(4): 502-512.
Authors: Wei Min  Wang Bin  He Xiang  Sun Jun  Jiang Xiaocheng  Xiao Sa  Zhang Li and Xu Jinxiu
Affiliation: 1. National Meteorological Information Center, Beijing 100081; 2. Jiangnan Institute of Computing Technology, Wuxi 214083; 3. National Climate Center, Beijing 100081
Abstract: With the rise of many-core processors such as Intel MIC, GPUs and the SW26010, supercomputer architecture has changed substantially. Supercomputers are moving from homogeneous systems built only from multi-core CPUs to heterogeneous systems in which CPUs coexist with many-core accelerators. Heterogeneous architectures offer great computing power for large, complex applications. However, numerical models have mostly been developed for conventional CPUs, which differ from many-core accelerators, so the existing tens of thousands of lines of legacy code cannot take full advantage of the parallel computing capacity of the new architecture. Porting and optimizing weather and climate numerical models on the new system is therefore important for improving how well the models adapt to the new computing architecture.
The Sunway TaihuLight system is the world's first supercomputer with a peak performance greater than 100 PFlops, and it is built on the homegrown SW26010 heterogeneous many-core chip. Each SW26010 processor consists of management processing elements (MPEs) and clusters of computing processing elements (CPEs). To support parallel computing on this heterogeneous architecture, the system provides a set of compilation tools, including basic C/C++ and Fortran compilers, as well as a customized Sunway OpenACC tool that supports the OpenACC 2.0 syntax.
As the atmospheric component of BCCCSM, BCCAGCM is the most computationally expensive component in typical configurations. Because BCCAGCM had not previously been run on the Sunway system, it is first ported using only the MPEs for computation. The computational framework is then analyzed to identify the major kernels that account for most of the run time. BCCAGCM originally uses a hybrid parallelization scheme combining MPI and OpenMP. On the Sunway system, MPI and OpenACC are used instead to obtain appropriate parallelism from the CPE clusters. On the one hand, the computational sequence and loop structures are adjusted to aggregate more parallel work, so that the parallelism of the CPE cluster is fully utilized. On the other hand, the design optimizes the data access and transfer strategy, improves LDM utilization, and minimizes the share of time spent on data movement and computational overhead.
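The data-access strategy described above can be pictured as blocking: a long loop is split into tiles small enough to fit in a limited local memory, and each tile is copied in, computed on, and copied back. The sketch below is only an illustration of that idea in plain Python, not the authors' Sunway OpenACC code. The 64 KB budget, the 8-byte value size, and the names `tile_size` and `tiled_axpy` are assumptions made for this example.

```python
LDM_BYTES = 64 * 1024   # assumed local-memory budget per compute element
BYTES_PER_VALUE = 8     # assumed double-precision values


def tile_size(arrays_per_tile, ldm_bytes=LDM_BYTES,
              bytes_per_value=BYTES_PER_VALUE):
    """Largest tile length such that `arrays_per_tile` arrays of that
    length fit in the local-memory budget at the same time."""
    return ldm_bytes // (arrays_per_tile * bytes_per_value)


def tiled_axpy(a, x, y, tile):
    """Compute a*x + y tile by tile.

    Each iteration mimics one offloaded block: slice the inputs
    ("copy in"), compute on the local copies, write the result back
    ("copy out"). The tiles are independent of each other, so in a
    real many-core setting they could be assigned to different cores.
    """
    out = [0.0] * len(x)
    for start in range(0, len(x), tile):
        stop = min(start + tile, len(x))
        x_local = x[start:stop]                       # copy in
        y_local = y[start:stop]                       # copy in
        r_local = [a * xv + yv for xv, yv in zip(x_local, y_local)]
        out[start:stop] = r_local                     # copy out
    return out


if __name__ == "__main__":
    n = 10000
    x = [float(i) for i in range(n)]
    y = [1.0] * n
    t = tile_size(arrays_per_tile=3)   # x, y and the result are resident together
    result = tiled_axpy(2.0, x, y, t)
    print(t, result[:3])
```

Under these assumptions three arrays share the budget, so each tile holds 2730 values. Larger tiles mean fewer transfers, which is one way to reduce the share of time spent moving data.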
The efficiency of the optimized MPE+CPE heterogeneous computation is compared with that of the original MPE-only computation. The optimized kernels generally run about 3 times as fast as before, and up to about 14 times as fast. After the kernels are integrated, the new version as a whole runs about 1.9 times as fast as before. Although the overall acceleration of the model is modest, the resulting heterogeneous many-core baseline version of BCCAGCM adds to the experience of optimizing and refactoring meteorological numerical models for new computing architectures.
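One way to see why kernels that run about 3 times faster give a whole-model speedup of only about 1.9 is Amdahl's law. The sketch below is a back-of-the-envelope illustration, not an analysis from the paper: it assumes every accelerated kernel gains the same factor and that the rest of the model runs at its original speed.

```python
def amdahl_speedup(fraction, kernel_speedup):
    """Overall speedup when `fraction` of the original run time is
    accelerated by `kernel_speedup` and the remainder is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


def implied_fraction(overall_speedup, kernel_speedup):
    """Invert Amdahl's law: the share of original run time that must
    have been accelerated to reach `overall_speedup`."""
    return (1.0 - 1.0 / overall_speedup) / (1.0 - 1.0 / kernel_speedup)


if __name__ == "__main__":
    # Figures from the abstract; the uniform 3x kernel gain is an assumption.
    p = implied_fraction(overall_speedup=1.9, kernel_speedup=3.0)
    print(f"implied accelerated fraction: {p:.2f}")
    print(f"check, overall speedup: {amdahl_speedup(p, 3.0):.2f}")
```

Under this simplification, roughly 71% of the original run time would lie in the accelerated kernels. The remaining unaccelerated share caps the overall gain, however fast the kernels become.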
Keywords:BCCAGCM  Sunway TaihuLight supercomputer  heterogeneous computing  many-core
This article is indexed by CNKI and other databases.