Evaluation of the credibility of multi-source address elements: a case study of road feature
Cite this article: SUN Licai, CHEN Yisong, XIONG Jie, LUO An, WANG Yong. Evaluation of the credibility of multi-source address elements: a case study of road feature[J]. Bulletin of Surveying and Mapping, 2021, 0(10): 108-113. DOI: 10.13474/j.cnki.11-2246.2021.315
Authors: SUN Licai  CHEN Yisong  XIONG Jie  LUO An  WANG Yong
Affiliation: 1. Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. Chinese Academy of Surveying & Mapping, Beijing 100036, China; 3. National-Local Joint Engineering Research Center of Technologies and Applications for National Geographic State Monitoring, Lanzhou 730070, China; 4. Gansu Provincial Engineering Laboratory for National Geographic State Monitoring, Lanzhou 730070, China; 5. China Telecom Corporation Limited Sichuan Branch, Chengdu 610015, China
Funding: Lanzhou Jiaotong University Excellent Platform (201806); National Key Research and Development Program of China (2017YFB0503502; 2017YBF0503601); Basic Scientific Research Fund of the Chinese Academy of Surveying and Mapping (AR2011)
Abstract: With advances in volunteered geographic information (VGI) and Chinese address element segmentation, the quality of the resulting address elements needs to be evaluated. To address the difficulty of evaluating the quality of address elements produced by segmenting Chinese address text, this paper proposes a credibility evaluation method supported by multi-source data and web retrieval. First, a Chinese word segmentation tool is used to segment each address element and tag parts of speech; the credibility of the element's naming structure is then computed by analyzing word frequencies and part-of-speech combination patterns. Second, large-scale address samples, road data, and POI data are mined for evidence supporting each address element, and a data support score is computed. Third, a search engine is used to retrieve each address element quickly, and its web credibility is computed from the content and number of the search results. Finally, a comprehensive credibility model is proposed to compute an overall credibility score for each address element. Experimental results show that the model and method can compute the credibility of address elements in Chinese address text quickly and efficiently, and can also effectively detect problems such as obscure or fake address elements, providing a reference for the automated checking and standardization of address elements.

Keywords: multi-source data  address element  credibility evaluation  Chinese word segmentation  normalization
Received: 2021-01-18
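
The abstract does not say how the three component scores (naming-structure credibility, data support, web credibility) are combined, only that a comprehensive model exists and that normalization is a keyword. As a rough illustration only, the sketch below assumes min-max normalization of each component across a batch of address elements followed by a weighted sum. The class `ElementScores`, the functions `min_max` and `combined_credibility`, the equal weights, and the sample values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ElementScores:
    """Raw per-element signals; all fields are hypothetical stand-ins."""
    name: str
    structure: float     # naming-structure score, e.g. from POS-pattern frequency
    data_support: float  # e.g. match count in address, road, and POI datasets
    web_hits: float      # e.g. number of search-engine results


def min_max(values):
    """Scale a non-empty list of numbers to [0, 1]; a constant list maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_credibility(elements, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return {name: score} as a weighted sum of min-max normalized components.

    The equal default weights are an assumption, not values from the paper.
    """
    w_s, w_d, w_w = weights
    s = min_max([e.structure for e in elements])
    d = min_max([e.data_support for e in elements])
    w = min_max([e.web_hits for e in elements])
    return {
        e.name: w_s * s[i] + w_d * d[i] + w_w * w[i]
        for i, e in enumerate(elements)
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = [
        ElementScores("Road A", structure=0.9, data_support=120, web_hits=50000),
        ElementScores("Road B", structure=0.2, data_support=0, web_hits=3),
    ]
    for name, score in combined_credibility(sample).items():
        print(f"{name}: {score:.2f}")
```

Under this assumed scheme, an element that ranks lowest on all three components scores near 0 and would be a candidate for manual review as a possibly obscure or fake address element. Because min-max scaling is relative to the batch, scores are only comparable within the same batch.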
This article is indexed in Wanfang Data and other databases.