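Cite this article: NIE Pei, CHEN Guangsheng, JING Weipeng. Research on a parallel construction and distributed storage model for vector tiles[J]. Journal of Geo-information Science, 2020, 22(7): 1487-1496. DOI: 10.12082/dqxxkx.2020.190255
Authors: NIE Pei  CHEN Guangsheng  JING Weipeng
Affiliations: 1. College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040; 2. Heilongjiang Province Engineering Technology Research Center for Forestry Ecological Big Data Storage and High Performance (Cloud) Computing, Harbin 150040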
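Abstract: Vector tiles are small, fast to generate, and support dynamic interaction. These advantages over traditional raster tiles make them a focus of research on next-generation Internet map services. To address the slow processing and poor scalability of current vector tile approaches, this paper uses the parallel computing framework Spark to build vector tiles quickly: custom transformation functions convert the original GeoJSON vector data into a set of MVT tiles. For the generated tile set, the paper designs a tile storage model, VectorTileStore, on top of the distributed in-memory file system Alluxio. The model stores data as key-value pairs: tile metadata occupies the first eight key-value pairs, and each tile occupies one key-value pair. As data is written, a hash index is built on the keys for fast access. The model accommodates the organization and storage of massive tile sets and is highly scalable. Experimental results show that the proposed parallel construction algorithm reduces running time by 49.6% on average compared with a single-machine construction algorithm, and that VectorTileStore is better suited to storing massive vector tile sets than traditional schemes, with more efficient reads and writes.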

Keywords: vector tile  web map service  parallel processing  Spark  distributed storage  Alluxio
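
The parallel construction step summarized in the abstract can be pictured as a map-and-group pass over features: each feature is assigned to the tile that contains it at every zoom level, and features sharing a tile key are collected for encoding. The sketch below is plain Python rather than Spark code, handles only GeoJSON Point features, and assumes the common XYZ (Web Mercator) tiling scheme, which the abstract does not specify. The function names are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

MAX_LAT = 85.05112878  # latitude limit of the Web Mercator projection


def lonlat_to_tile(lon, lat, zoom):
    """Return (x, y) of the XYZ tile containing a WGS84 point at a zoom level."""
    n = 2 ** zoom
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def assign_to_tiles(feature, zooms):
    """Emit ((z, x, y), feature) for each zoom level (a flatMap-style step)."""
    lon, lat = feature["geometry"]["coordinates"]
    for z in zooms:
        x, y = lonlat_to_tile(lon, lat, z)
        yield (z, x, y), feature


def build_tile_groups(features, zooms):
    """Group features by tile key (a groupByKey-style step)."""
    groups = defaultdict(list)
    for feature in features:
        for key, feat in assign_to_tiles(feature, zooms):
            groups[key].append(feat)
    return dict(groups)
```

In a Spark job, the per-feature step and the grouping step would typically map onto transformations such as `flatMap` and `groupByKey`, so that tiles are produced in parallel across partitions. Encoding each group into MVT bytes is left out of the sketch.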

Parallel Construction and Distributed Storage for Vector Tile
NIE Pei, CHEN Guangsheng, JING Weipeng. Parallel Construction and Distributed Storage for Vector Tile[J]. Geo-information Science, 2020, 22(7): 1487-1496. DOI: 10.12082/dqxxkx.2020.190255
Authors: NIE Pei  CHEN Guangsheng  JING Weipeng
Affiliation: 1. College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China; 2. Heilongjiang Province Engineering Technology Research Center for Forestry Ecological Big Data Storage and High Performance (Cloud) Computing, Harbin 150040, China
Abstract: With the spread of information technology, Internet maps that integrate multi-source geospatial information are widely used in fields such as forestry, marine, land, transportation, and military applications. At the same time, advances in Earth observation, surveying, and mapping have driven rapid growth in high-precision, wide-coverage spatial data, ushering in an era of geospatial big data. Against this background, building Internet map services quickly and efficiently has become both a research priority and a challenge. Raster tiles were used to build the earliest Internet maps and played an important role in their rapid rise in popularity. However, as maps move to mobile devices and applications deepen, the large size and low efficiency of raster tiles have become increasingly apparent, and raster tiles struggle to meet application needs. Vector tiles have many advantages over traditional raster tiles: they are small, efficient to generate, and support dynamic interaction, and they are becoming the focus of research on next-generation Internet map services. To speed up processing and improve scalability in current vector tile applications, this study applies big data technology to vector tile processing. First, we use the parallel computing framework Spark to build the vector tile pyramid model. Specifically, by customizing Spark transformation functions, we parallelize the tile generation steps and convert the original GeoJSON vector data into a set of Mapbox Vector Tiles (MVT). Then, we design a tile storage model, VectorTileStore, that stores the generated MVT tiles on the distributed in-memory file system Alluxio. VectorTileStore stores data as key-value pairs: the tile metadata occupies the first eight key-value pairs, and each tile occupies one key-value pair. As data is written, a hash index is built on the keys for fast access. The model stores massive tile sets efficiently and is highly scalable. Experimental results show that the proposed parallel construction algorithm reduces running time by 49.6% on average compared with a single-machine construction algorithm, and that the VectorTileStore distributed storage model is more efficient than traditional schemes and better suited to processing massive vector tile data.
Keywords: vector tile  web map service  parallel processing  Spark  distributed storage  Alluxio
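
The storage layout described in the abstract (eight metadata key-value pairs first, then one key-value pair per tile, with a hash index built on the keys at write time) can be illustrated with a small in-memory sketch. This is not the authors' implementation and does not use Alluxio: a Python list stands in for the key-value file and a dict stands in for the hash index. The class name, the `z/x/y` key format, and the metadata field names are assumptions, since the abstract does not specify them.

```python
class VectorTileStoreSketch:
    """In-memory stand-in for a key-value tile store with a hash index."""

    METADATA_SLOTS = 8  # the abstract states metadata fills the first eight pairs

    def __init__(self, metadata):
        if len(metadata) != self.METADATA_SLOTS:
            raise ValueError("expected exactly 8 metadata entries")
        self._records = []  # append-only sequence of (key, value) pairs
        self._index = {}    # hash index: key -> position in _records
        for key, value in metadata.items():
            self._append(key, value)

    def _append(self, key, value):
        """Write one record and update the index in the same step."""
        if key in self._index:
            raise KeyError(f"duplicate key: {key}")
        self._index[key] = len(self._records)
        self._records.append((key, value))

    @staticmethod
    def tile_key(z, x, y):
        # Hypothetical key format; the paper's actual key encoding is not given.
        return f"{z}/{x}/{y}"

    def put_tile(self, z, x, y, data):
        self._append(self.tile_key(z, x, y), data)

    def get_tile(self, z, x, y):
        """Look up a tile through the hash index; return None if it is absent."""
        pos = self._index.get(self.tile_key(z, x, y))
        return None if pos is None else self._records[pos][1]

    def metadata(self):
        return dict(self._records[: self.METADATA_SLOTS])

    def __len__(self):
        return len(self._records)


# Placeholder metadata names, chosen only for illustration.
example_meta = {f"meta_{i}": "" for i in range(8)}
store = VectorTileStoreSketch(example_meta)
store.put_tile(3, 5, 2, b"mvt-bytes")
```

Because the index is updated on every write, a tile lookup is a single dictionary access followed by a positional read, instead of a scan over all records.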
This article is indexed in CNKI and other databases.