A Low-Sampling-Rate Trajectory Matching Algorithm in Combination of History Trajectory and Reinforcement Learning


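Cite this article:	SUN Wenbin, XIONG Ting. A Low-Sampling-Rate Trajectory Matching Algorithm in Combination of History Trajectory and Reinforcement Learning[J]. Acta Geodaetica et Cartographica Sinica, 2016, 45(11): 1328-1334.

Authors:	SUN Wenbin, XIONG Ting

Institution:	College of Geosciences and Surveying Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China

Foundation support:	The National Natural Science Foundation of China (41671383)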

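Abstract:	To address the limited accuracy of map-matching algorithms for low-frequency trajectory data (sampling interval greater than 1 min), this paper proposes HMDP-Q, a matching algorithm based on reinforcement learning and historical trajectories. First, historical paths are extracted with an incremental matching algorithm to form a historical reference library. The candidate path set is then filtered according to the historical reference library, shortest paths, and reachability. Next, map matching is modeled as a Markov decision process whose reward function is built from the distance by which trajectory points deviate from the road and from historical trajectories. A reinforcement learning algorithm then solves for the maximum reward of the Markov decision process, which corresponds to the optimal match between trajectory and road. Finally, the algorithm is tested on floating car trajectory data from a city. The results show that the algorithm effectively improves the accuracy of matching trajectory data to roads: matching accuracy reaches 89.2% at a 1 min sampling interval and still reaches 61.4% at a 16 min sampling interval. Compared with the IVVM algorithm, HMDP-Q achieves higher matching accuracy and better computational efficiency, and at a 16 min sampling interval it improves trajectory matching accuracy by 26%.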
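The candidate filtering step can be illustrated with a minimal Python sketch. The abstract names the three criteria (historical reference library, shortest path, reachability) but gives no formulas, so everything concrete here is an assumption for illustration: the `CandidatePath` type, the `filter_candidates` function, the speed bound `max_speed_mps`, and the ranking rule (history support first, then shorter length) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePath:
    segments: tuple   # ordered road-segment ids (hypothetical identifiers)
    length_m: float   # path length along the road network, in metres


def filter_candidates(paths, interval_s, history_counts,
                      max_speed_mps=33.3, keep=3):
    """Drop unreachable paths, then keep the `keep` best-ranked ones.

    Assumed reachability rule: a path is reachable if it can be driven
    within the sampling interval at no more than `max_speed_mps`.
    Assumed ranking: more history support first, shorter length second.
    """
    max_len = max_speed_mps * interval_s
    reachable = [p for p in paths if p.length_m <= max_len]
    reachable.sort(
        key=lambda p: (-history_counts.get(p.segments, 0), p.length_m)
    )
    return reachable[:keep]


# Example: three candidate paths between two consecutive GPS fixes 60 s apart.
paths = [
    CandidatePath(("a", "b"), 900.0),
    CandidatePath(("a", "c", "b"), 1500.0),
    CandidatePath(("a", "d", "b"), 5000.0),   # too long to drive in 60 s
]
history = {("a", "c", "b"): 4}                 # seen 4 times in past trips
kept = filter_candidates(paths, interval_s=60, history_counts=history,
                         max_speed_mps=30.0)
print([p.segments for p in kept])
```

In this sketch the longest path is discarded as unreachable, and the path with history support is ranked ahead of the shorter path without it.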
Keywords:	low-frequency floating car data; trajectory matching; Markov decision process; reinforcement learning
Received:	2016-02-01
Revised:	2016-10-01
A Low-Sampling-Rate Trajectory Matching Algorithm in Combination of History Trajectory and Reinforcement Learning

SUN Wenbin, XIONG Ting. A Low-Sampling-Rate Trajectory Matching Algorithm in Combination of History Trajectory and Reinforcement Learning[J]. Acta Geodaetica et Cartographica Sinica, 2016, 45(11): 1328-1334.

Authors:	SUN Wenbin, XIONG Ting

Institution:	College of Geosciences and Surveying Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China

Abstract:	To improve the accuracy of matching low-frequency trajectory data (sampling interval greater than 1 minute), this paper proposes a novel matching algorithm termed HMDP-Q (History Markov Decision Processes Q-learning). The new algorithm is based on reinforcement learning over historical trajectories. First, we extract historical trajectory data with an incremental matching algorithm as a historical reference, and filter the candidate path set using the historical reference, the shortest path, and reachability. Then we model the map matching process as a Markov decision process and build the reward function from the distance by which trajectory points deviate from the road and from historical trajectories. The largest reward value of the Markov decision process, computed with a reinforcement learning algorithm, gives the optimal matching of trajectory to road. Finally, we test the algorithm on floating car data from a city. The results show that the algorithm improves the accuracy of matching trajectory data to roads. The matching accuracy is 89.2% at a 1-minute sampling interval and 61.4% at a 16-minute sampling interval. Compared with IVVM (Interactive Voting-based Map Matching), HMDP-Q has higher matching accuracy and computing efficiency. In particular, at a 16-minute sampling interval, HMDP-Q improves the matching accuracy by 26%.
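To make the Markov decision process formulation concrete, the following Python sketch applies tabular Q-learning to a toy matching problem. It is an illustration under stated assumptions, not the HMDP-Q implementation: the state is taken to be the candidate road chosen for the previous GPS point, the action is the candidate chosen for the current point, and the reward is assumed to be the negative point-to-road offset plus a weighted history bonus. The function name `q_learning_match`, the weight `w_hist`, the `history_bonus` table, and all hyperparameters are hypothetical.

```python
import random


def q_learning_match(offsets, history_bonus, episodes=2000, alpha=0.5,
                     gamma=1.0, epsilon=0.2, w_hist=1.0, seed=0):
    """Pick one candidate road per GPS point with tabular Q-learning.

    offsets[t][k]: distance (m) from GPS point t to its candidate road k.
    history_bonus[(t, j, k)]: assumed support from historical trajectories
        for moving from candidate j of point t to candidate k of point t + 1.
    Returns the greedy candidate index for each GPS point.
    """
    rng = random.Random(seed)
    n = len(offsets)
    # q[t][j][k]: value of choosing candidate k for point t when point t - 1
    # was matched to candidate j (a single virtual start state for t = 0).
    q = [[[0.0] * len(offsets[t])
          for _ in range(len(offsets[t - 1]) if t > 0 else 1)]
         for t in range(n)]

    def reward(t, j, k):
        # Assumed reward: small offset is good, history support is good.
        return -offsets[t][k] + w_hist * history_bonus.get((t - 1, j, k), 0.0)

    def best(row):
        return max(range(len(row)), key=row.__getitem__)

    for _ in range(episodes):
        j = 0
        for t in range(n):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                k = rng.randrange(len(offsets[t]))
            else:
                k = best(q[t][j])
            future = max(q[t + 1][k]) if t + 1 < n else 0.0
            q[t][j][k] += alpha * (reward(t, j, k) + gamma * future - q[t][j][k])
            j = k

    # Greedy rollout over the learned Q-table gives the matched sequence.
    path, j = [], 0
    for t in range(n):
        j = best(q[t][j])
        path.append(j)
    return path


# Toy example: 3 GPS points, 2 candidate roads each.
offsets = [[5.0, 20.0], [30.0, 8.0], [6.0, 25.0]]
bonus = {(0, 0, 1): 10.0, (1, 1, 0): 10.0}
print(q_learning_match(offsets, bonus))
```

Because each GPS point adds one decision layer, the Q-table grows with the number of points times the square of the number of candidates per point, which is one reason filtering the candidate set beforehand matters.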

Keywords:	low-sampling-rate floating car data; trajectory matching; Markov decision process; reinforcement learning
This article is indexed in CNKI, Wanfang Data, and other databases.