
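A Method for Identifying Metro Passengers' Trip Purposes Based on Multi-source Geographic Big Data and Machine Learning
Cite this article: ZHAO Pengjun, CAO Yushu. A Method for Identifying Metro Passengers' Trip Purposes Based on Multi-source Geographic Big Data and Machine Learning[J]. Geo-information Science, 2020, 22(9): 1753-1765.
Authors: ZHAO Pengjun  CAO Yushu
Institution: The Centre for Urban Planning and Transport Studies, College of Urban and Environmental Sciences, Peking University, Beijing 100871
Funding: National Natural Science Foundation of China (41925003); Research Councils UK Global Challenges Research Fund (R48843)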
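Abstract: Developing methods to identify the trip purposes of metro passengers helps overcome the limitations of Smart Card Data (SCD) in specific application scenarios and raises the value of SCD in travel research and transport planning. This paper fuses multi-source geographic big data and builds on the theory of spatio-temporal interaction between urban transport and land use. Taking metro travel by Beijing residents as a case, we extract 5565 metro trip samples from travel survey data, together with their trip purposes and trip-characteristic variables. Land use variables for the origin and destination stations of each sample are derived from Point of Interest (POI) data, yielding a metro trip dataset that records the trip purpose, trip characteristics, and land use features of every metro trip. A classifier trained on this dataset with the Random Forest (RF) algorithm then classifies each metro trip recorded in the SCD, giving the purpose of each trip and the spatio-temporal distribution of metro trips by purpose. The results show that the method predicts metro passengers' trip purposes effectively: prediction accuracy exceeds 90% for both the "go to work" and "go home" purposes. Including land use variables significantly improves the prediction accuracy of the RF classifier, which supports the theory of spatio-temporal interaction between urban transport and land use. As SCD becomes increasingly available, the technique is readily transferable to practical tasks such as monitoring and forecasting residents' metro travel, planning the metro network layout, and planning land use around metro stations, and it supports a more complete understanding of metro travel behaviour among residents of large cities.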

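Keywords: metro trips  trip purpose identification  travel survey data  smart card data  point of interest data  random forest  land use  spatio-temporal interaction  Beijing
Received: 2019-03-22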

Identifying Metro Trip Purpose using Multi-source Geographic Big Data and Machine Learning Approach
ZHAO Pengjun, CAO Yushu. Identifying Metro Trip Purpose using Multi-source Geographic Big Data and Machine Learning Approach[J]. Geo-information Science, 2020, 22(9): 1753-1765.
Authors: ZHAO Pengjun  CAO Yushu
Institution: The Centre for Urban Planning and Transport Studies, College of Urban and Environmental Sciences, Peking University, Beijing 100871, China
Abstract: Identifying metro trip purposes from Smart Card Data (SCD) is important for expanding the application of SCD in transport research and transport planning. This paper integrates several types of big data and draws on theories of the interaction between transport and land use. Taking Beijing as a case, we first analyze the trip purposes of individual passengers using 5565 metro trip samples extracted from travel survey data. Second, we investigate the land use features of trip origins and destinations using Point of Interest (POI) data. Third, we build a metro trip dataset that includes trip purpose, trip duration, and the spatial distribution of trip origins and destinations. Fourth, a Random Forest (RF) classifier is trained on this dataset. Finally, the trained classifier is applied to each metro trip recorded in the SCD to identify its purpose and the spatial distribution of metro trips by purpose. The results show that the trained classifier can effectively identify metro trip purposes from SCD. For trips with "go to work" and "go home" purposes, identification accuracy exceeds 90%. One reason for the high accuracy is that land use information is included in the RF classifier. Our results support the theory of spatial-temporal interactions between transport and land use. Multi-source geographic big data and resident travel survey data are increasingly available in large cities, so the method developed here should be valuable for metro trip prediction and monitoring, transport planning, and land use policy-making around metro stations. Our results also improve knowledge of metro travel behavior in megacities.
Keywords: Metro trips  trip purpose  travel survey data  smart card data  point of interest data  Random Forest algorithm  land use  spatial-temporal interactions  Beijing
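
Illustrative sketch (not from the paper): the abstract says that land use variables for origin and destination stations are derived from POI data and combined with trip characteristics, but it does not say how. The Python below shows one plausible construction under stated assumptions: POIs are counted by category within a circular buffer around each station and converted to shares. The 800 m radius, the four category names, and the function names `land_use_shares` and `trip_features` are all hypothetical.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0

# Hypothetical POI categories; the paper's actual classification is not given in the abstract.
CATEGORIES = ("residential", "office", "retail", "education")


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def land_use_shares(station, pois, radius_m=800.0):
    """Share of each POI category among POIs within radius_m of a station.

    station is (lat, lon); pois is an iterable of (lat, lon, category).
    The 800 m default is an assumed walking catchment, not a value from the paper.
    """
    counts = Counter(
        cat
        for lat, lon, cat in pois
        if cat in CATEGORIES
        and haversine_m(station[0], station[1], lat, lon) <= radius_m
    )
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CATEGORIES]


def trip_features(entry_hour, duration_min, origin, destination, pois):
    """One feature row: trip characteristics plus origin and destination land use shares."""
    return (
        [entry_hour, duration_min]
        + land_use_shares(origin, pois)
        + land_use_shares(destination, pois)
    )
```

Rows built this way for surveyed trips, paired with their reported purposes, could train any random forest implementation (for example scikit-learn's `RandomForestClassifier`), and the same function could then produce rows for trips recorded in the SCD.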