Similar articles
Found 20 similar articles (search time: 15 ms)
1.
The clustering of spatio‐temporal events has become one of the most important research branches of spatio‐temporal data mining. However, the discovery of clusters of spatio‐temporal events with different shapes and densities remains a challenging problem because of the subjectivity in the choice of two critical parameters: the spatio‐temporal window for estimating the density around each event, and the density threshold for evaluating the significance of clusters. To make the clustering of spatio‐temporal events objective, in this study these two parameters were adaptively generated from statistical information about the dataset. More precisely, the density threshold was statistically modeled as an adjusted significance level controlled by the cardinality and support domain of the dataset, and the appropriate sizes of spatio‐temporal windows for clustering were determined by the spatio‐temporal classification entropy and stability analysis. Experiments on both simulated and earthquake datasets were conducted, and the results show that the proposed method can identify clusters of different shapes and densities.

2.
Traditional dual clustering algorithms cannot adaptively perform clustering well without sufficient prior knowledge of the dataset. This article aims at accommodating both spatial and non‐spatial attributes in detecting clusters without relying on default parameter settings or prior knowledge. A novel adaptive dual clustering algorithm (ADC+) is proposed to obtain satisfactory clustering results considering the spatial proximity and attribute similarity in the presence of noise and barriers. In this algorithm, Delaunay triangulation is utilized to adaptively obtain spatial proximity and spatially homogeneous patterns based on particle swarm optimization (PSO). Then, a hierarchical clustering method is employed to obtain clusters with similar attributes. The hierarchical clustering method adopts a discriminating coefficient to adaptively control the depth of the hierarchical architecture. The clustering results are further refined using an optimization approach. The advantages and practicability of the ADC+ algorithm are illustrated by experiments on both simulated datasets and real‐world applications. It is found that the proposed ADC+ algorithm can adaptively and accurately detect clusters with arbitrary shapes, similar attributes and densities under the consideration of barriers.

3.
With the wide use of laser scanning technology, point cloud data collected from airborne sensors and terrestrial sensors are often integrated to depict a complete scenario from the top and ground views, even though points from different platforms and sensors have quite different densities. These massive point clouds with various structures create many problems for both data management and visualization. In this article, a hybrid spatial index method is proposed and implemented to manage and visualize integrated point cloud data from airborne and terrestrial scanners. This hybrid spatial index structure combines an extended quad‐tree model at the global level to manage large area airborne sensor data, with a 3‐D R‐tree to organize high density local area terrestrial point clouds. These massive point clouds from different platforms have diverse densities, but this hybrid spatial index system has the capability to organize the data adaptively and query efficiently, satisfying the requirements for fast visualization. Experiments using point cloud data collected from the Dunhuang area were conducted to evaluate the efficiency of our proposed method.

4.
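Traditional spatial queries cannot meet the processing-time requirements of interactive visualization of geographic data. Taking window queries as an example, this article proposes an approximate spatial query processing method. The method has two steps, preprocessing and querying. In the preprocessing stage, a distributed line simplification algorithm samples the spatial objects in an error-aware manner, and the sampling process and error values are stored in a tree structure. In the query stage, the visualization error is defined by the Hausdorff distance, and vertices are sampled and clipped on the fly with a known error, which yields efficient approximate spatial query processing for visualization applications. Experiments on a Hadoop cluster with a 77 GB OpenStreetMap dataset confirm the effectiveness and efficiency of the method.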
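
The abstract above measures visualization error with the Hausdorff distance. As a rough illustration only, the sketch below computes the discrete Hausdorff distance between the vertex sets of an original and a simplified polyline. Treating each line as a set of vertices (rather than continuous segments) is a simplifying assumption, and the function names are hypothetical, not taken from the article.

```python
from math import dist


def directed_hausdorff(a, b):
    """Largest distance from any vertex in a to its nearest vertex in b."""
    return max(min(dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric discrete Hausdorff distance between two vertex lists."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical data: a three-vertex line and its two-vertex simplification.
original = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
simplified = [(0.0, 0.0), (2.0, 0.0)]
error = hausdorff(original, simplified)  # distance of the dropped vertex
```

An error bound of this kind could be stored with each simplification level so that a query can stop refining once the error falls below what the display resolution can show.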

5.
New, free and fast growing spatial data sources have appeared online, based on Volunteered Geographic Information (VGI). OpenStreetMap (OSM) is one of the most representative projects of this trend. Its increasing popularity and density make the study of its data quality imperative. A common approach is to compare OSM with a reference dataset. In such cases, data matching is necessary for the comparison to be meaningful, and is usually performed manually at the data preparation stage. This article proposes an automated feature‐based matching method specifically designed for VGI, based on a multi‐stage approach that combines geometric and attribute constraints. It is applied to the OSM dataset using the official data from Ordnance Survey as the reference dataset. The results are then used to evaluate data completeness of OSM in several case studies in the UK.

6.
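Traffic congestion detection is one of the key and most difficult tasks in urban traffic management. Existing congestion detection uses the road segment as its unit, which hinders the extraction of information on the spatio-temporal evolution of congestion, and most methods detect only the degree of congestion without identifying its type. Based on the CART (classification and regression tree) classification tree algorithm, this article proposes a congestion point classification and detection method that uses road segment points as the detection unit and can detect congestion points and their types in real time from the average travel speed on road segments. First, road segments are divided at equal distances and mapped to road segment points, and the spatio-temporal evolution patterns of four congestion types are analyzed at the level of road segment points according to rules and patterns of abnormal traffic conditions in the spatial and temporal dimensions. Second, on the basis of segment-level traffic condition detection, spatio-temporal sequences of traffic conditions at road segment points are extracted and labeled by class according to the patterns of the different congestion types. Then, four speed indicators are selected as the sample attribute set, and the speeds of each road segment point in each time period are extracted according to this set to form the dataset for decision tree learning. Finally, based on the CART classification tree algorithm, the optimal model is trained with cross-validation to achieve the best generalization ability. In a comparison with a support vector machine (SVM) classification model, the experimental results show that the method achieves high accuracy and recall in classifying and detecting traffic congestion points, with good timeliness.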

7.
Geo‐SOM is a useful geovisualization technique for revealing patterns in spatial data, but is ineffective in supporting interactive exploration of patterns hidden in different Geo‐SOM sizes. Based on the divide and group principle in geovisualization, the article proposes a new methodology that combines Geo‐SOM and hierarchical clustering to tackle this problem. Geo‐SOM was used to “divide” the dataset into several homogeneous subsets; hierarchical clustering was then used to “group” neighboring homogeneous subsets for pattern exploration in different levels of granularity, thus permitting exploration of patterns at multiple scales. An artificial dataset was used for validating the method's effectiveness. As a case study, the rush hour motorcycle flow data in Taipei City, Taiwan were analyzed. Compared with the best result generated solely by Geo‐SOM, the proposed method performed better in capturing the homogeneous zones in the artificial dataset. For the case study, the proposed method discovered six clusters with unique data and spatial patterns at different levels of granularity, while the original Geo‐SOM only identified two. Among the four hierarchical clustering methods, Ward's clustering performed the best in pattern discovery. The results demonstrated the effectiveness of the approach in visually and interactively exploring data and spatial patterns in geospatial data.

8.
Machine learning allows “the machine” to deduce the complex and sometimes unrecognized rules governing spatial systems, particularly topographic mapping, by exposing it to the end product. Often, the obstacle to this approach is the acquisition of many good and labeled training examples of the desired result. Such is the case with most types of natural features. To address such limitations, this research introduces GeoNat v1.0, a natural feature dataset, used to support artificial intelligence‐based mapping and automated detection of natural features under a supervised learning paradigm. The dataset was created by randomly selecting points from the U.S. Geological Survey’s Geographic Names Information System and includes approximately 200 examples each of 10 classes of natural features. Resulting data were tested in an object‐detection problem using a region‐based convolutional neural network. The object‐detection tests resulted in a 62% mean average precision as baseline results. Major challenges in developing training data in the geospatial domain, such as scale and geographical representativeness, are addressed in this article. We hope that the resulting dataset will be useful for a variety of applications and shed light on training data collection and labeling in the geospatial artificial intelligence domain.

9.
Mapping Large Spatial Flow Data with Hierarchical Clustering   Cited by: 6 (self-citations: 0, citations by others: 6)
It is challenging to map large spatial flow data due to the problem of occlusion and cluttered display, where hundreds of thousands of flows overlap and intersect each other. Existing flow mapping approaches often aggregate flows using predetermined high‐level geographic units (e.g. states) or bundle partial flow lines that are close in space, both of which cause a significant loss or distortion of information and may miss major patterns. In this research, we developed a flow clustering method that extracts clusters of similar flows to avoid the cluttering problem, reveal abstracted flow patterns, and preserve data resolution as much as possible. Specifically, our method extends the traditional hierarchical clustering method to aggregate and map large flow data. The new method considers both origins and destinations in determining the similarity of two flows, which ensures that a flow cluster represents flows from similar origins to similar destinations and thus minimizes information loss during aggregation. With the spatial index and search algorithm, the new method is scalable to large flow data sets. As a hierarchical method, it generalizes flows to different hierarchical levels and has the potential to support multi‐resolution flow mapping. Different distance definitions can be incorporated to adapt to uneven spatial distribution of flows and detect flow clusters of different densities. To assess the quality and fidelity of flow clusters and flow maps, we carry out a case study to analyze a data set of 243,850 taxi trips within an urban area.

10.
Density‐based clustering algorithms such as DBSCAN have been widely used for spatial knowledge discovery as they offer several key advantages compared with other clustering algorithms. They can discover clusters with arbitrary shapes, are robust to noise, and do not require prior knowledge (or estimation) of the number of clusters. The idea of using a scan circle centered at each point with a search radius Eps to find at least MinPts points as a criterion for deriving local density is easily understandable and sufficient for exploring isotropic spatial point patterns. However, there are many cases that cannot be adequately captured this way, particularly if they involve linear features or shapes with a continuously changing density, such as a spiral. In such cases, DBSCAN tends to either create an increasing number of small clusters or add noise points into large clusters. Therefore, in this article, we propose a novel anisotropic density‐based clustering algorithm (ADCN). To motivate our work, we introduce synthetic and real‐world cases that cannot be handled sufficiently by DBSCAN (or OPTICS). We then present our clustering algorithm and test it with a wide range of cases. We demonstrate that our algorithm can perform as well as DBSCAN in cases that do not benefit explicitly from an anisotropic perspective, and that it outperforms DBSCAN in cases that do. Finally, we show that our approach has the same time complexity as DBSCAN and OPTICS, namely O(n log n) when using a spatial index and O(n^2) otherwise. We provide an implementation and test the runtime over multiple cases.
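
For readers unfamiliar with the baseline, the scan-circle criterion described above (a radius Eps and a minimum count MinPts) can be sketched as plain isotropic DBSCAN. This is a minimal illustration of the baseline only, not the anisotropic ADCN algorithm proposed in the article. It uses a brute-force neighbor search, which corresponds to the quadratic case without a spatial index; the function names are hypothetical.

```python
from math import dist


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (brute-force scan circle)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = noise  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels
```

Because the scan circle is the same in every direction, a thin linear feature needs a large Eps to stay connected, which is the limitation the abstract attributes to the isotropic approach.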

11.
Discovering Spatial Interaction Communities from Mobile Phone Data   Cited by: 4 (self-citations: 0, citations by others: 4)
In the age of Big Data, the widespread use of location‐awareness technologies has made it possible to collect spatio‐temporal interaction data for analyzing flow patterns in both physical space and cyberspace. This research attempts to explore and interpret patterns embedded in the network of phone‐call interaction and the network of phone‐users’ movements, by considering the geographical context of mobile phone cells. We adopt an agglomerative clustering algorithm based on a Newman‐Girvan modularity metric and propose an alternative modularity function incorporating a gravity model to discover the clustering structures of spatial‐interaction communities using a mobile phone dataset from one week in a city in China. The results verify the distance decay effect and spatial continuity that control the process of partitioning phone‐call interaction, which indicates that people tend to communicate within a spatial‐proximity community. Furthermore, we discover that a high correlation exists between phone‐users’ movements in physical space and phone‐call interaction in cyberspace. Our approach presents a combined qualitative‐quantitative framework to identify clusters and interaction patterns, and explains how geographical context influences communities of callers and receivers. The findings of this empirical study are valuable for urban structure studies as well as for the detection of communities in spatial networks.
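
The standard Newman‐Girvan modularity mentioned above can be sketched for a weighted, undirected interaction network as follows. This is only the standard metric; the gravity-model variant proposed in the article is not reproduced here. The edge format and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition.

    edges: iterable of (u, v, weight) for an undirected graph.
    community: dict mapping each node to its community label.
    Q = sum over communities c of (W_c / m) - (D_c / (2 m))^2, where m is the
    total edge weight, W_c the weight inside c, and D_c the summed degree of c.
    """
    m = 0.0
    intra = defaultdict(float)
    degree = defaultdict(float)
    for u, v, w in edges:
        m += w
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            intra[community[u]] += w
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)
```

An agglomerative procedure of the kind the abstract describes would repeatedly merge the pair of communities whose merge increases Q the most, stopping when no merge improves it.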

12.
Due to high data volume, massive spatial data requires considerable computing power for real‐time processing. Currently, high performance clusters are the only economically viable solution given the development of multicore technology and computer component cost reduction in recent years. Massive spatial data processing demands heavy I/O operations, however, and should be characterized as a data‐intensive application. Data‐intensive application parallelization strategies, such as decomposition, scheduling and load‐balance, differ considerably from those of traditional compute‐intensive applications. In this article we introduce a Split‐and‐Merge paradigm for spatial data processing and also propose a robust parallel framework in a cluster environment to support this paradigm. The Split‐and‐Merge paradigm efficiently exploits data parallelism for massive data processing. The proposed framework is based on the open‐source TORQUE project and hosted on a multicore‐enabled Linux cluster. A specific data‐aware scheduling algorithm was designed to exploit data sharing between tasks and decrease the data communication time. Two LiDAR point cloud algorithms, IDW interpolation and Delaunay triangulation, were implemented on the proposed framework to evaluate its efficiency and scalability. Experimental results demonstrate that the system provides efficient performance speedup.

13.
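Liu Xiaoyun, Chen Wufan, Wang Zhensong. Acta Geodaetica et Cartographica Sinica (《测绘学报》), 2007, 36(4): 400-405, 442
Hierarchical clustering based on the finite mixture (FM) model has been widely applied in many fields. However, because its computational complexity is proportional to the square of the number of observations, its application to remote sensing imagery has been limited. In addition, multispectral images provide detailed spatial and spectral information, yet most multispectral image clustering methods are pixel-based and use only the spectral information while ignoring the spatial information. This article defines a relative mixture density function, introduces a q-parameter to adjust the contribution of each component density to the mixture distribution, and proposes a generalized finite mixture (GFM) model. A new GFM hierarchical clustering algorithm suited to multispectral remote sensing imagery is then designed. The algorithm combines a Markov random field (MRF) with the GFM model, and the number of classes is determined automatically by the PLIC criterion. Finally, simulation results verify the effectiveness of the algorithm, and comparisons with K-means clustering, FM hierarchical clustering and SVMM hierarchical clustering demonstrate its advantages.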

14.
With the fast growth of all kinds of trajectory datasets, how to effectively manage the trajectory data of moving objects has received a lot of attention. This study proposes a spatio‐temporal data integrated compression method of vehicle trajectories based on stroke paths coding compression under the road stroke network constraint. The road stroke network is first constructed according to the principle of continuous coherence in Gestalt psychology, and then two types of Huffman trees, a road strokes Huffman tree and a stroke paths Huffman tree, are built, based respectively on the importance function of road strokes and vehicle visiting frequency of stroke paths. After the vehicle trajectories are map matched to the spatial paths in the road network, the Huffman codes of the road strokes and stroke paths are used to compress the trajectory spatial paths. An opening window algorithm is used to simplify the trajectory temporal data depicted on a time–distance polyline by setting the maximum allowable speed difference as the threshold. Through analysis of the relative spatio‐temporal relationship between the preceding and following feature tracking points, the spatio‐temporal data of the feature tracking points are all converted to binary codes together, accordingly achieving integrated compression of trajectory spatio‐temporal data. A series of comparative experiments between the proposed method and representative state‐of‐the‐art methods are carried out on a real massive taxi trajectory dataset from five aspects, and the experimental results indicate that our method has the highest compression ratio. Meanwhile, this method also has favorable performance in other aspects: compression and decompression time overhead, storage space overhead, and historical dataset training time overhead.
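
The role of the Huffman trees above can be illustrated with a generic Huffman coder: items that are visited more often receive shorter bit strings. The sketch below is a textbook construction under stated assumptions, not the article's implementation. The symbols are assumed to be strings, and the weights stand in for the importance values or visiting frequencies the abstract mentions.

```python
import heapq
from itertools import count


def huffman_codes(frequencies):
    """Map each string symbol to a prefix-free bit string.

    frequencies: dict of symbol -> positive weight. Heavier symbols get
    codes that are no longer than those of lighter symbols.
    """
    if not frequencies:
        return {}
    if len(frequencies) == 1:
        return {symbol: "0" for symbol in frequencies}
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), s) for s, w in frequencies.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                        # leaf: a symbol
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


def encode(sequence, codes):
    """Concatenate the codes of a sequence of symbols into one bit string."""
    return "".join(codes[s] for s in sequence)
```

With hypothetical visit counts such as `{"path_A": 50, "path_B": 30, "path_C": 15, "path_D": 5}`, the most visited path gets a one-bit code, so a trajectory that mostly follows popular paths compresses well.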

15.
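Detecting Spatial Anomalies Using Clustering Techniques   Cited by: 1 (self-citations: 0, citations by others: 1)
Deng Min, Liu Qiliang, Li Guangqiang. Journal of Remote Sensing (《遥感学报》), 2010, 14(5): 951-965
A clustering-based method for detecting spatial anomalies is proposed. The method uses spatial clustering to obtain sets of entities with strong local correlation and detects spatial anomalies within each set separately. A robust measure of spatial anomaly is given, which improves the reliability of the detection results. A case study and a comparison with the SOM method demonstrate the correctness and advantages of the method.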

16.
In this article, multilayer perceptron (MLP) network models with spatial constraints are proposed for regionalization of geostatistical point data based on multivariate homogeneity measures. The study focuses on non‐stationarity and autocorrelation in spatial data. Supervised MLP machine learning algorithms with spatial constraints have been implemented and tested on a point dataset. MLP spatially weighted classification models and an MLP contiguity‐constrained classification model are developed to conduct spatially constrained regionalization. The proposed methods have been tested with an attribute‐rich point dataset of geological surveys in Ukraine. The experiments show that consideration of the spatial effects, such as the use of spatial attributes and their respective whitening, improves the output of regionalization. It is also shown that spatial sorting used to preserve spatial contiguity leads to improved regionalization performance.

17.
Spatial anomalies may be single points or small regions whose non‐spatial attribute values are significantly inconsistent with those of their spatial neighborhoods. In this article, a Spatial Anomaly Points and Regions Detection method using multi‐constrained graphs and local density (SAPRD for short) is proposed. The SAPRD algorithm first models spatial proximity relationships between spatial entities by constructing a Delaunay triangulation, the edges of which provide certain statistical characteristics. By considering the difference in non‐spatial attributes of adjacent spatial entities, two levels of non‐spatial attribute distance constraints are imposed to improve the proximity graph. This produces a series of sub‐graphs, and those with very few entities are identified as candidate spatial anomalies. Moreover, the spatial anomaly degree of each entity is calculated based on the local density. A spatial interpolation surface of the spatial anomaly degree is generated using inverse distance weighting, and this is utilized to reveal potential spatial anomalies and reflect their whole areal distribution. Experiments on both simulated and real‐life spatial databases demonstrate the effectiveness and practicability of the SAPRD algorithm.
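
The interpolation step above can be illustrated with a basic inverse-distance-weighted estimate. This is a generic sketch: the power of 2 and the sample format are assumptions, and the anomaly-degree values would come from the article's local density measure, which is not reproduced here.

```python
from math import dist


def idw(samples, query, power=2.0):
    """Inverse-distance-weighted estimate at query.

    samples: list of ((x, y), value) pairs, e.g. entity locations with
    their anomaly degree. Nearer samples receive larger weights.
    """
    numerator = denominator = 0.0
    for location, value in samples:
        d = dist(location, query)
        if d == 0.0:
            return value  # query coincides with a sample point
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator
```

Evaluating `idw` on a regular grid of query points would give a surface of the kind the abstract describes, where peaks mark locations of potential anomalies.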

18.
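Local Geometric Feature Extraction of Built-up Areas from High-Resolution Remote Sensing Imagery   Cited by: 1 (self-citations: 0, citations by others: 1)
Timely and accurate acquisition of the spatial distribution of urban built-up areas and of their changes is important for urban planning, the construction of geospatial databases, and regional socio-economic analysis. This article proposes an adaptive method for detecting local geometric invariant features that combines the multi-scale Gabor transform with a perceptual grouping method, tensor voting (TV), and applies it to the extraction of built-up areas from high-spatial-resolution remote sensing imagery. First, given the complex geometric structure of high-resolution remote sensing imagery, a Gabor filter bank applies a multi-scale, multi-orientation transform to the image to detect singularity features. Then, within the perceptual grouping framework and following tensor voting theory, the coefficient positions of the sub-bands of different orientations are encoded as corresponding second-order symmetric orientation tensors. To emphasize the geometric features of the image, the orientation tensors at each pixel position in the sub-bands of different scales and orientations are weighted by the filter response coefficients and summed, which completes the multi-scale feature fusion. Next, eigen-decomposition of the tensors yields saliency maps of point and line structures, non-maximum suppression extracts local geometric features such as corners and curves, and constraint criteria are generated to filter the corners and determine building coordinates. Finally, probability density estimation combined with the local corner features generates a global probability density field that describes the probability that each pixel belongs to a building target, and threshold segmentation with the maximum between-class variance method (Otsu) automatically extracts the polygonal residential areas. Experiments were carried out on image datasets including Google Earth imagery at resolutions of 0.49 m and 0.98 m and Gaofen-2 imagery at 0.8 m. The results show that, compared with the existing Harris and HSCD point detection algorithms, the proposed method improves built-up area extraction quality (Quality) by 4.79% and 5.96%; 1.47% and 3.76%; and 1.91% and 4.08%, respectively.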
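
The final thresholding step uses the Otsu (maximum between-class variance) method. A minimal sketch follows, assuming the probability density field has already been quantized to integer levels in the range 0 to 255; that quantization and the function name are assumptions made for illustration, not details from the article.

```python
def otsu_threshold(values, bins=256):
    """Return the level t that maximizes between-class variance.

    values: iterable of integers in [0, bins). Levels <= t form the
    background class and levels > t the foreground class.
    """
    hist = [0] * bins
    for v in values:
        hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * n for level, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of values in the background class
    sum0 = 0    # sum of levels in the background class
    for t in range(bins):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # background class still empty
        w1 = total - w0
        if w1 == 0:
            break     # foreground class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t
```

Pixels whose quantized probability exceeds the returned level would be kept as candidate built-up area; no manual threshold has to be chosen, which is why the abstract calls the extraction automatic.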

19.
Laser scanning systems have been established as leading tools for the collection of high density three-dimensional data over physical surfaces. The collected point cloud does not provide semantic information about the characteristics of the scanned surfaces. Therefore, different processing techniques have been developed for the extraction of useful information from this data which could be applied for diverse civil, industrial, and military applications. Planar and linear/cylindrical features are among the most important primitive information to be extracted from laser scanning data, especially those collected in urban areas. This paper introduces a new approach for the identification, parameterization, and segmentation of these features from laser scanning data while considering the internal characteristics of the utilized point cloud – i.e., local point density variation and noise level in the dataset. In the first step of this approach, a Principal Component Analysis of the local neighborhood of individual points is implemented to identify the points that belong to planar and linear/cylindrical features and select their appropriate representation model. For the detected planar features, the segmentation attributes are then computed through an adaptive cylinder neighborhood definition. Two clustering approaches are then introduced to segment and extract individual planar features in the reconstructed parameter domain. For the linear/cylindrical features, their directional and positional parameters are utilized as the segmentation attributes. A sequential clustering technique is proposed to isolate the points which belong to individual linear/cylindrical features through directional and positional attribute subspaces. Experimental results from simulated and real datasets demonstrate the feasibility of the proposed approach for the extraction of planar and linear/cylindrical features from laser scanning data.

20.
Mobility and spatial interaction data have become increasingly available due to the wide adoption of location‐aware technologies. Examples of mobility data include human daily activities, vehicle trajectories, and animal movements, among others. In this article we focus on a special type of mobility data, i.e. origin‐destination pairs, and present a new approach to the discovery and understanding of spatio‐temporal patterns in the movements. Specifically, to extract information from complex connections among a large number of point locations, the approach involves two steps: (1) spatial clustering of massive GPS points to recognize potentially meaningful places; and (2) extraction and mapping of the flow measures of clusters to understand the spatial distribution and temporal trends of movements. We present a case study with a large dataset of taxi trajectories in Shenzhen, China to demonstrate and evaluate the methodology. The contribution of the research is two‐fold. First, it presents a new methodology for detecting location patterns and spatial structures embedded in origin‐destination movements. Second, the approach is scalable to large data sets and can summarize massive data to facilitate pattern extraction and understanding.
