A Chinese Address Parsing Method Using RoBERTa-BiLSTM-CRF

Cite this article: | ZHANG Hongwei, DU Qingyun, CHEN Zhangjian, ZHANG Chen. A Chinese Address Parsing Method Using RoBERTa-BiLSTM-CRF[J]. Geomatics and Information Science of Wuhan University, 2022, 47(5): 665-672. DOI: 10.13203/j.whugis20210112

Authors: | ZHANG Hongwei, DU Qingyun, CHEN Zhangjian, ZHANG Chen

Affiliation: | 1. School of Electronic Information, Wuhan University, Wuhan 430072, Hubei, China

Funding: | National Key Research and Development Program of China (2016YFC0803106)

Abstract: | To address the problems that current address matching methods rely heavily on word segmentation dictionaries and cannot effectively recognize the address elements in an address or their types, a Chinese address parsing method using deep learning is proposed; it standardizes the parsed addresses and analyzes their composition to improve address matching results. A comparative evaluation of different word vector representations of addresses and different sequence labeling models shows that bidirectional gated recurrent units and bidirectional long short-term memory networks differ little in parsing Chinese addresses, and that the sparse attention mechanism helps improve the F1-score of address parsing. The proposed method achieves an F1-score of 0.940 on the generalization ability test dataset and 0.968 on the ordinary test dataset.

Keywords: | address parsing; Chinese address segmentation; attention mechanism; long short-term memory network; RoBERTa; BiLSTM; CRF

Received: | 2021-06-08
A Chinese Address Parsing Method Using RoBERTa-BiLSTM-CRF |
Affiliation: | 1. School of Electronic Information, Wuhan University, Wuhan 430072, China; 2. School of Resources and Environmental Sciences, Wuhan University, Wuhan 430079, China; 3. Zhejiang Academy of Surveying and Mapping, Hangzhou 311100, China

Abstract: | Objectives Current address matching methods rely heavily on word segmentation dictionaries and cannot effectively recognize the address elements in an address or their types; to address these problems, a Chinese address parsing method based on deep learning is proposed. Methods A model combining the robustly optimized bidirectional encoder representations from transformers (BERT) pretraining approach (RoBERTa), a bidirectional long short-term memory (BiLSTM) network, and a conditional random field (CRF) is used to parse Chinese addresses. First, the RoBERTa model is used to obtain the word vector representation of the address. Second, BiLSTM is used to learn address model features and contextual information. Finally, CRF is used to construct the constraint relations between the labels. Results Through the comparison and evaluation of different word vector representations of addresses and different sequence labeling models, the proposed method achieves a maximum F1-score of 0.940 on the generalization ability test dataset, while the precision, recall, and F1-score on the corresponding ordinary test dataset reach 0.962, 0.974, and 0.968, respectively. Conclusions The proposed method needs neither hand-crafted address model features nor a word segmentation dictionary for address segmentation; address elements are recognized by learning address context information and address model features. The generalization ability test dataset used in this study can effectively test whether the model is overfitted. The bidirectional gated recurrent unit (BiGRU) and BiLSTM differ little for Chinese address parsing, and the sparse attention mechanism helps improve the accuracy of address parsing.

Keywords: | address parsing; Chinese address segmentation; attention mechanism; long short-term memory network; RoBERTa; BiLSTM; CRF
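The CRF layer in the pipeline described above scores whole label sequences rather than individual tokens, so that transitions which are invalid in BIO-style tagging (e.g. an inside tag directly after an outside tag) are ruled out at decoding time. A minimal sketch of this idea, as a pure-Python Viterbi decoder with hypothetical labels, emission scores, and transition scores (not the paper's actual model or data):

```python
def viterbi(emissions, transitions, labels):
    """Find the highest-scoring label sequence under a linear-chain CRF.

    emissions: list of {label: score} dicts, one per token (e.g. from BiLSTM).
    transitions: {(prev_label, cur_label): score}; missing pairs are forbidden.
    Returns the best label sequence as a list of labels.
    """
    # DP table: best path score ending in each label at the current token.
    best = {lab: emissions[0][lab] for lab in labels}
    back = []  # backpointers, one dict per token after the first
    for em in emissions[1:]:
        new_best, ptr = {}, {}
        for cur in labels:
            # Best predecessor: unlisted transitions score -inf (disallowed).
            prev, score = max(
                ((p, best[p] + transitions.get((p, cur), float("-inf")))
                 for p in labels),
                key=lambda x: x[1],
            )
            new_best[cur] = score + em[cur]
            ptr[cur] = prev
        back.append(ptr)
        best = new_best
    # Trace the highest-scoring path backwards through the backpointers.
    last = max(best, key=best.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

Because the transition table omits pairs like ("O", "I-city"), the decoder will prefer a valid "B-city" before an "I-city" even when the per-token emission scores alone would favor the invalid sequence; this is the constraint-between-labels role the abstract assigns to the CRF.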