A hybrid method for Chinese address segmentation
Authors: Lin Li, Wei Wang, Biao He, Yu Zhang
Institution: 1. School of Resource and Environmental Sciences, Wuhan University, Wuhan, China; 2. Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources, Shenzhen, China; 3. Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan, China; 4. Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources, Shenzhen, China; 5. College of Architecture and Urban Planning, Shenzhen University, Shenzhen, China
Abstract: Chinese address segmentation is a serious challenge in geocoding for geographic information systems. Most previous studies have relied on predefined gazetteers without considering the information contained in a raw address corpus. In this paper, a hybrid method that combines rule-based and statistical methods is proposed for Chinese address segmentation without a predefined gazetteer. The approach uses statistical methods to extract address information from a raw address corpus and a rule-based method to segment Chinese addresses. Two typical statistical methods and their combinations with rule-based methods are compared with the hybrid method in an experiment involving approximately 460,000 address items in Shenzhen City, China. The experimental results indicate that the proposed method achieves an F-score of over 0.8, which is higher than the scores of the compared methods, thus validating the proposed method.
Keywords: Geocoding; Chinese address segmentation without gazetteers; hybrid method
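
The F-score reported in the abstract can be illustrated with a minimal Python sketch. It assumes the common span-matching convention for segmentation evaluation, in which a predicted segment counts as correct only if both of its boundaries match a reference segment; the abstract does not state the exact protocol used in the paper. The function names and the example address are hypothetical and are not taken from the paper or its data set.

```python
def spans(segments):
    """Convert a list of segments into a set of (start, end) character spans."""
    out, pos = set(), 0
    for seg in segments:
        out.add((pos, pos + len(seg)))
        pos += len(seg)
    return out


def segmentation_f_score(predicted, gold):
    """F-score of a predicted segmentation against a reference segmentation.

    Assumption: a predicted segment is correct only when its start and end
    offsets both match a reference segment exactly.
    """
    pred_spans, gold_spans = spans(predicted), spans(gold)
    correct = len(pred_spans & gold_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Hypothetical address, used only for illustration.
gold = ["深圳市", "南山区", "科技路", "1号"]
predicted = ["深圳市", "南山区", "科技", "路1号"]
print(segmentation_f_score(predicted, gold))
```

In this example two of the four predicted segments match the reference, so precision and recall are both 0.5 and the F-score is 0.5.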