
Chinese Address Segmentation Algorithm Based on Chinese Address Knowledge Bases
Cite this article: Zhao Cheng, Li Bin. Chinese Address Segmentation Algorithm Based on Chinese Address Knowledge Bases[J]. Journal of Geomatics Science and Technology (测绘科学技术学报), 2017(6): 639-643, 648.
Authors: Zhao Cheng (赵成), Li Bin (李滨)
Affiliations: 1. Information Engineering University, Zhengzhou, Henan 450001; 2. College of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan 450001
Funding: Henan Province Science and Technology Research Project (162102310612); Key Science and Technology Research Project of the Henan Provincial Department of Education (15A420004)
Abstract: Chinese addresses are unstructured and non-standard. To address this, knowledge bases including Chinese address templates and a Chinese address dictionary are constructed. A preprocessing step for Chinese address segmentation is introduced based on the address templates, and segmentation is then performed with the reverse maximum matching algorithm, supported by the address dictionary. The new segmentation algorithm outperforms the traditional algorithm on accuracy, recall, and other indicators, and it also provides a new method for identifying unregistered address nouns, that is, address nouns absent from the dictionary.

Keywords: Chinese address; Chinese address knowledge bases; Chinese address segmentation; reverse maximum matching algorithm; unregistered address nouns
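
The abstract names reverse maximum matching as the dictionary-driven segmentation step. A minimal sketch of the generic technique follows; it is not the authors' implementation. The dictionary `ADDRESS_DICT`, the function name `reverse_max_match`, and the single-character fallback for text missing from the dictionary are assumptions made for illustration. The template-based preprocessing and the paper's method for unregistered address nouns are not reproduced here.

```python
def reverse_max_match(text, dictionary, max_len=None):
    """Segment `text` by scanning from the right end and, at each step,
    taking the longest dictionary word that ends at the current position."""
    if max_len is None:
        # Longest dictionary entry bounds the candidate window.
        max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shorten it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # Assumption: a single character is emitted as a fallback
            # when no dictionary word matches.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                end -= size
                break
    tokens.reverse()  # tokens were collected right to left
    return tokens


# Hypothetical toy dictionary; the paper's address dictionary is not given here.
ADDRESS_DICT = {"河南省", "郑州市", "郑州", "科学大道", "大道", "62号"}

if __name__ == "__main__":
    print(reverse_max_match("河南省郑州市科学大道62号", ADDRESS_DICT))
```

Scanning from the right tends to suit Chinese addresses because the discriminating suffix characters (省, 市, 路, 号) sit at the end of each address element. In this sketch, text absent from the dictionary falls apart into single characters, which is the unregistered-noun problem the abstract refers to.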
This article is indexed in CNKI, Wanfang Data, and other databases.