
Research on Unsupervised Clustering Analysis of Large-scale Pulsar Candidate Signals
Citation: LIU Ying, MA Zhi, YOU Zi-yi, WANG Pei, DANG Shi-jun, ZHAO Ru-shuang, DONG Ai-jun. Research on Unsupervised Clustering Analysis of Large-scale Pulsar Candidate Signals[J]. Acta Astronomica Sinica, 2022, 63(3): 36-138.
Authors: LIU Ying  MA Zhi  YOU Zi-yi  WANG Pei  DANG Shi-jun  ZHAO Ru-shuang  DONG Ai-jun
Institution: 1. School of Physics and Electronic Science, Guizhou Normal University, Guiyang 550025; 2. National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012
Funding: Supported by the National Natural Science Foundation of China (U1731238, U1838108) and the Guizhou Provincial Science and Technology Foundation (ZK[2022]304)
Received: 2021-08-06

Abstract: With the construction and operation of large radio telescopes such as the Five-hundred-meter Aperture Spherical radio Telescope (FAST), pulsar survey data have entered the petabyte (PB) era. To address the problem of mining such large volumes of rapidly sampled scalar data, and to promote the discovery of new astronomical phenomena, this paper proposes a pulsar candidate sifting scheme based on unsupervised clustering. The scheme uses a hybrid clustering algorithm that combines density-based hierarchical and partitioning methods, together with the MapReduce/Spark parallel computing model and a sliding-window grouping strategy, to improve the efficiency of screening large numbers of candidate signals. Comparative experiments on the pulsar data set HTRU2 (High Time Resolution Universe) show that the algorithm achieves relatively high precision and recall, 0.946 and 0.905 respectively, and that, when enough parallel nodes are available, its time complexity is significantly lower than that of serial execution. The method thus offers a feasible approach to the analysis and mining of big data from pulsar observations.

Keywords: pulsar: general  data set: HTRU2 (High Time Resolution Universe)  methods: hybrid clustering  methods: unsupervised
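
The abstract names a sliding-window grouping strategy and reports precision and recall, but gives no implementation details. The following is a minimal sketch of those two ideas only, under stated assumptions: the function names `sliding_window_groups` and `precision_recall`, the window parameters `size` and `step`, and the choice to sort candidates by a single scalar feature are all hypothetical and are not taken from the paper. In a parallel setting, each window could plausibly be handed to a separate worker for clustering; that step is not shown here.

```python
from typing import List, Sequence, Tuple


def sliding_window_groups(values: Sequence[float], size: int, step: int) -> List[List[float]]:
    """Sort the values and cut them into overlapping windows.

    Hypothetical illustration: `size` is the window length and `step` is
    how far the window advances, so step < size gives overlapping groups.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    ordered = sorted(values)
    groups: List[List[float]] = []
    start = 0
    while start < len(ordered):
        groups.append(ordered[start:start + size])
        # Stop once the current window already reaches the end of the data.
        if start + size >= len(ordered):
            break
        start += step
    return groups


def precision_recall(predicted: Sequence[int], actual: Sequence[int]) -> Tuple[float, float]:
    """Compute precision and recall for binary labels (1 = pulsar, 0 = non-pulsar)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # true positives
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # false positives
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy values standing in for one scalar feature per candidate (made up).
    print(sliding_window_groups([5, 1, 4, 2, 3], size=3, step=2))
    # Toy labels (made up), not HTRU2 results.
    print(precision_recall([1, 1, 0, 1, 0], [1, 0, 0, 1, 1]))
```

Overlapping windows are one way to keep candidates near a group boundary from being separated from their neighbours; whether the paper's strategy overlaps windows in this way is an assumption.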