Similar literature
Found 20 similar documents (search time: 62 ms)
1.
In stream sediment and soil surveys, samples represent mixtures of components from different geological environments. Such mixed samples are misclassified when using conventional “hard” cluster methods. In fuzzy clustering, each sample is allowed to belong to several clusters. Similar to element concentrations, these cluster contributions can be displayed in contour maps (e.g. kriging maps). The amount of an element that is explained by the cluster contribution and element residuals can be calculated. The modified fuzzy clustering algorithm called “limited fuzzy clusters” used in this paper avoids negative residuals. Stream sediment data of Sierra de San Carlos, Tamaulipas, Mexico are used to demonstrate the possibilities of limited fuzzy clustering in geochemical exploration and mapping. From the different drainage systems, 681 stream sediment samples were taken and analyzed for 24 elements. A nineteen-element data set was used to calculate limited fuzzy clusters and element residuals. The contribution values for the clusters and element residuals are displayed in contour maps. All geological units were outlined by the cluster contributions. Extended anomalies are characterized by their own cluster. Small anomalies are clearly identified from the element residuals.
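
As a rough illustration of how one sample can belong to several clusters at once, the sketch below computes standard fuzzy c-means memberships for a single sample. It is not the “limited fuzzy clusters” variant named in the abstract, whose details are not given here. The function name `fuzzy_memberships`, the fuzzifier `m = 2` and the example coordinates are assumptions made for illustration.

```python
import math


def fuzzy_memberships(sample, centers, m=2.0):
    """Standard fuzzy c-means membership of one sample in each cluster.

    u_k = 1 / sum_j (d_k / d_j) ** (2 / (m - 1)), so the memberships sum to 1.
    `m` (the fuzzifier) is an assumed value; m = 2 is a common default.
    """
    dists = [math.dist(sample, c) for c in centers]
    # A sample lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]


# Hypothetical two-element sample between two hypothetical cluster centers.
memberships = fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

A hard clustering method would assign this sample entirely to the nearer center; the fuzzy memberships instead split it between both, which is the property the abstract relies on for mixed samples.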

2.
On Distance Measures for the Fuzzy K-means Algorithm for Joint Data   Cited by: 7 (self-citations: 0, citations by others: 7)
Summary: The analysis of data collected on rock discontinuities often requires that the data be separated into joint sets or groups. A statistical tool that facilitates the automatic identification of groups or clusters of observations in a data set is cluster analysis. The fuzzy K-means cluster technique has been successfully applied to the analysis of joint survey data. As is the case with all clustering algorithms, the results of an analysis performed with the fuzzy K-means algorithm for discontinuity data are highly dependent on the distance metric employed in the analysis. This paper explores the significant issues surrounding the choice and use of various distance measures for clustering joint survey data. It also proposes an analogue of the Mahalanobis distance norm (used for data in Euclidean space) for clustering spherical data. Sample applications showing the greater flexibility and power of the new distance measure over the originally proposed distance metric for spherical data are given in the paper.

3.
Zones of mixing between shallow groundwaters of different composition were unravelled by “two-way regionalized classification,” a technique based on correspondence analysis (CA), cluster analysis (ClA) and discriminant analysis (DA), aided by gridding, map-overlay and contouring tools. The shallow groundwaters are from a granitoid plutonite in the Fundão region (central Portugal). Correspondence analysis detected three natural clusters in the working dataset: 1, weathering; 2, domestic effluents; 3, fertilizers. Cluster analysis set an alternative distribution of the samples by the three clusters. Group memberships obtained by correspondence analysis and by cluster analysis were optimized by discriminant analysis, gridded over the entire Fundão region, and converted into “two-way regionalized classification” memberships as follows: codes 1, 2 or 3 were used when classification by correspondence analysis and cluster analysis produced the same results; code 0 when the grid node was first assigned to cluster 1 and then to cluster 2 or vice versa (mixing between weathering and effluents); code 4 in the other cases (mixing between agriculture and the other influences). Code-3 areas were systematically surrounded by code-4 areas, an observation attributed to hydrodynamic dispersion. Accordingly, the extent of code-4 areas in two orthogonal directions was assumed proportional to the longitudinal and transverse dispersivities of local soils. The results (0.7–16.8 and 0.4–4.3 m, respectively) are acceptable at the macroscopic scale. The ratios between longitudinal and transverse dispersivities (1.2–11.1) are also in agreement with results obtained by other studies.
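
The coding rule described in this abstract can be sketched as a small function. This is only a reading of the rule as stated above, applied to a single grid node; the function name `two_way_code` and its argument names are hypothetical.

```python
def two_way_code(ca_cluster, cla_cluster):
    """Combine two cluster assignments (each 1, 2 or 3) into one code.

    ca_cluster:  cluster assigned by correspondence analysis
    cla_cluster: cluster assigned by cluster analysis
    """
    if ca_cluster == cla_cluster:
        return ca_cluster  # both methods agree: keep code 1, 2 or 3
    if {ca_cluster, cla_cluster} == {1, 2}:
        return 0  # mixing between weathering and effluents
    return 4  # remaining disagreements involve cluster 3 (fertilizers)


# Codes for a few hypothetical grid nodes.
codes = [two_way_code(a, b) for a, b in [(1, 1), (1, 2), (3, 2)]]
```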

4.
Geochemical exploration in secondary environments can be viewed as a particular manifestation of indirect geological observation. Geochemical anomalies in complex sample media reflect dispersion signatures, generally much disguised by secondary or higher-order mechanical and physico-chemical processes such as mixing, comminution, dilution, (re)transportation, weathering etc. Such complexities often make a thorough understanding of the origin of any particular sample type difficult to obtain. The objective of data analysis in this context is to separate the geochemical data into a meaningful “signal”, particularly useful for prospecting, and other, in this case irrelevant, variability or “noise”. The experience of the last decades of practical exploration has clearly shown that statistical as well as geographical geochemical anomaly patterns are multi-element signatures. Using suitable multivariate statistical procedures (in the present case principal components modelling), it is possible simultaneously to define a background data model and to quantify multivariate geochemical anomalies. This type of data analysis is guided very strongly by geological interaction, in which the emphasis is on modelling the background population(s), coupled with geographic plotting facilities. This outlier-screening facility is critical for many types of geochemical data evaluation. An example of this approach is described below. Another application of indirect multivariate data analysis is represented by PLS (Partial Least Squares) regression, which is a supervised pattern recognition and regression technique. We use it here to predict modal scheelite occurrences from regional stream-sediment data.

5.
A robust classification scheme for partitioning water chemistry samples into homogeneous groups is an important tool for the characterization of hydrologic systems. In this paper we test the performance of the many available graphical and statistical methodologies used to classify water samples including: Collins bar diagram, pie diagram, Stiff pattern diagram, Schoeller plot, Piper diagram, Q-mode hierarchical cluster analysis, K-means clustering, principal components analysis, and fuzzy k-means clustering. All the methods are discussed and compared as to their ability to cluster, ease of use, and ease of interpretation. In addition, several issues related to data preparation, database editing, data-gap filling, data screening, and data quality assurance are discussed and a database construction methodology is presented. The use of graphical techniques proved to have limitations compared with the multivariate methods for large data sets. Principal components analysis is useful for data reduction and to assess the continuity/overlap of clusters or clustering/similarities in the data. The most efficient grouping was achieved by statistical clustering techniques. However, these techniques do not provide information on the chemistry of the statistical groups. The combination of graphical and statistical techniques provides a consistent and objective means to classify large numbers of samples while retaining the ease of classic graphical presentations.

6.
Changes in the stress field of an aquifer system induced by seismotectonic activity may change the mixing ratio of groundwaters with different compositions in a well, leading to hydrochemical signals which in principle could be related to discrete earthquake events. Due to the complexity of the interactions and the multitude of involved factors, the identification of such relationships is a difficult task. In this study we present an empirical statistical approach for analysing whether there is an interdependency between changes in the chemical composition of monitoring wells and the regional seismotectonic activity of a considered area. To allow a rigorous comparison with hydrochemistry, the regional earthquake time series was aggregated into a univariate time series. This was realized by expressing each earthquake in the form of a parameter “e”, taking into consideration both energetic (magnitude of a seismic event) and spatial parameters (position of the epicentre/hypocentre relative to the monitoring site). The earthquake and the hydrochemical time series were synchronised by aggregating the e-parameters into “earthquake activity” functions E, which take into account the time of sampling relative to the earthquakes which occurred in the considered area. For the definition of the aggregation functions a variety of different “e” parameters were considered. The set of earthquake functions E was grouped by means of factor analysis to select a limited number of significant and representative earthquake functions E to be used further on in the relation analysis with the multivariate hydrochemical data set. From the hydrochemical data a restricted number of hydrochemical factors were extracted. Factor scores make it possible to represent and analyse the variation of the hydrochemical factors as a function of time.
Finally, regression analysis was used to detect those hydrochemical factors which significantly correlate with the aggregated earthquake functions. This methodological approach was tested with a hydrochemical data set collected from a deep well monitored for two years in the seismically active Vrancea region, Romania. Three of the hydrochemical factors were found to correlate significantly with the considered earthquake activities. A screening with different time combinations revealed that correlations are strongest when the cumulative seismicity over several weeks was considered. The case study also showed that the character of the interdependency sometimes depends on the geometrical distribution of the earthquake foci. By using aggregated earthquake information it was possible to detect interrelationships which could not have been identified by analysing only relations between single geochemical signals and single earthquake events. Furthermore, the approach makes it possible to determine the influence of different seismotectonic patterns on the hydrochemical composition of the sampled well. The method is suitable as a decision instrument for assessing whether a monitoring site should be included in a monitoring net within a complex earthquake prediction strategy.

7.
The threshold between geochemical background and anomalies can be influenced by the methodology selected for its estimation. Environmental evaluations, particularly those conducted in mineralized areas, must consider this when trying to determine the natural geochemical status of a study area, quantifying human impacts, or establishing soil restoration values for contaminated sites. Some methods in environmental geochemistry incorporate the premise that anomalies (natural or anthropogenic) and background data are characterized by their own probabilistic distributions. One of these methods uses exploratory data analysis (EDA) on regional geochemical data sets coupled with a geographic information system (GIS) to spatially understand the processes that influence the geochemical landscape in a technique that can be called a spatial data analysis (SDA). This EDA–SDA methodology was used to establish the regional background range from the area of Catorce–Matehuala in north-central Mexico. Probability plots of the data, particularly for those areas affected by human activities, show that the regional geochemical background population is composed of smaller subpopulations associated with factors such as soil type and parent material. This paper demonstrates that the EDA–SDA method offers more certainty in defining thresholds between geochemical background and anomaly than a numeric technique, making it a useful tool for regional geochemical landscape analysis and environmental geochemistry studies.

8.
Various marbles from both historic quarries and historical artefacts of the Czech Republic were examined in order to make determinations of their provenance. The methodology used was based upon a combination of petrographic image analysis (PIA) of thin sections, stable isotope geochemistry of carbonates, and cathodoluminescence. Multivariate statistical methods (i.e. cluster analysis and discriminant analysis) confirmed the geoscientific relevance of the marble’s different characteristics with a high degree of consistency as well as the enhanced significance of stable C and O isotopes in correlation with the petrographic data. The qualitative cathodoluminescence data provided a useful additional tool to help recognise the fingerprinting of marbles with similar petrographic and/or geochemical characteristics.

9.
Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples.

10.
Applied Geochemistry, 2002, 17(3): 185-206
A large regional geochemical data set of C-horizon podzol samples from a 188,000 km² area in the European Arctic, analysed for more than 50 elements, was used to test the influence of different variants of factor analysis on the results extracted. Due to the nature of regional geochemical data (neither normal nor log-normal, strongly skewed, often multi-modal data distributions), the simplest methods of factor analysis with the least statistical assumptions perform best. As a result of this test it can generally be suggested to use principal factor analysis with an orthogonal rotation for such data. Selecting the number of factors to extract is difficult; however, the scree plot provides some useful help. For the test data, a low number of extracted factors gave the most informative results. Deleting or adding just 1 element in the input matrix can drastically change the results of factor analysis. Given that the selection of elements is often based on the availability of analytical packages (or detection limits) rather than on geochemical reasoning, this is a disturbing result. Factor analysis revealed the most interesting data structures when a low number of variables were entered. A graphical presentation of the loadings and a simple, automated mapping technique allow extraction of the most interesting results of different factor analyses at a glance. Results presented here underline the importance of careful univariate data analysis prior to entering factor analysis. Outliers should be removed from the dataset and different populations present in the data should be treated separately. Factor analysis can be used to explore a large data set for hidden multivariate data structures.

11.
Mine site characterization often results in the acquisition of geological, geotechnical and hydrogeological data sets that are used in the mine design process but are rarely co-evaluated. For a study site in northern Canada, bivariate and multivariate (hierarchical) statistical techniques are used to evaluate empirical hydraulic conductivity estimation methods based on traditional rock mass characterisation schemes, as well as to assess the regional hydrogeological conceptual model. Bivariate techniques demonstrate that standard geotechnical measures of fracturing are poor indicators of the hydraulic potential of a rock mass at the study site. Additionally, rock-mass-permeability schemes which rely on these measures are shown to be poor predictors of hydraulic conductivity in untested areas. Multivariate techniques employing hierarchical cluster analysis of both geotechnical and geological data sets are able to identify general trends in the data. Specifically, the geological cluster analysis demonstrated a spatial relationship between intrusive contacts and increased hydraulic conductivity. This suggests promise in the use of clustering methods in identifying new trends during the early stages of hydrogeological characterization.

12.
The study of hydrogeochemistry of the Mio-Pliocene sedimentary rock aquifer system in Veeranam catchment area produced a large geochemical dataset. Groundwater samples were collected at 52 sites over a 963.86 km² area and analyzed for major ions. The large volume of data can lead to difficulties in the integration, interpretation and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to a subgroup of the dataset to evaluate their usefulness in classifying the groundwater samples and in identifying the geochemical processes controlling groundwater geochemistry. Hydrochemical data for 52 groundwater samples were subjected to Q- and R-mode factor and cluster analysis. R-mode analysis reveals the inter-relations among the variables studied and Q-mode analysis reveals the inter-relations among the samples studied. The R-mode factor analysis shows that Ca, Mg and Cl with HCO3 account for most of the electrical conductivity, total dissolved solids and total hardness of groundwater. The ‘single dominance’ nature of the majority of the factors in the R-mode analysis indicates non-mixing or partial mixing of different types of groundwater. Both Q-mode factor and Q-mode cluster analyses indicate an exchange between the river water and the groundwater in the vicinity. Rock-water interaction, such as with the flood basin back swamp deposits of silty clayey formation, is the major cause for the cluster II classification. The cluster classification map reveals that 58% of the study area comes under the cluster II classification.

13.
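Implementing R-mode cluster analysis in Excel   Cited by: 2 (self-citations: 0, citations by others: 2)
春乃芽. 物探与化探 (Geophysical and Geochemical Exploration), 2007, 31(4): 374-376
R-mode cluster analysis is a statistical method that classifies a set of elements according to a quantified measure of their similarity. Its main steps are: transforming the raw data; computing the correlation coefficients; and clustering the result. The method and steps for implementing R-mode cluster analysis with the data analysis tools in Excel are well suited to the work of geologists in the field.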
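
The three steps named in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the Excel procedure of the paper: the Pearson correlation is used as the similarity measure (it standardizes the data implicitly, which stands in for the transformation step), and groups are merged by single linkage until the best correlation falls below a threshold. The function names, the threshold of 0.8 and the element concentrations are all assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_mode_clusters(data, threshold):
    """Group variables (elements), not samples, by their correlation.

    data: dict mapping element name -> list of values in the same sample order.
    Groups are merged (single linkage) while the highest between-group
    correlation is at least `threshold`.
    """
    groups = [[name] for name in data]
    while len(groups) > 1:
        best, pair = -2.0, None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                r = max(pearson(data[a], data[b])
                        for a in groups[i] for b in groups[j])
                if r > best:
                    best, pair = r, (i, j)
        if best < threshold:
            break
        i, j = pair
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups


# Hypothetical concentrations of three elements in five samples.
example = {
    "Cu": [1, 2, 3, 4, 5],
    "Zn": [2, 4, 6, 8, 11],
    "Au": [5, 1, 4, 2, 3],
}
groups = r_mode_clusters(example, threshold=0.8)
```

In this made-up data set Cu and Zn rise together and end up in one group, while Au is only weakly (and negatively) correlated with both and stays on its own.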

14.

This paper offers a new method for the definition of geotechnical sectors in open pit mines based on multivariate cluster analysis. A geological-geotechnical data set of a manganese open pit mine was used to demonstrate the methodology. The data set consists of a survey of geological and geotechnical parameters of the rock mass, measured directly in several points of the mine, structured initially in twenty-eight variables. After the preprocessing of the data set, the clustering technique was applied using the k-Prototype algorithm. The squared Euclidean distance was used to quantify the proximity between numerical variables, and the Jaccard's coefficient of similarity was used to quantify the proximity between the nominal variables. The different cluster results obtained were validated by the multivariate analysis of variance. The identification of cluster structures was achieved by plotting them on the mine map for spatial visualization and definition of geotechnical sectors. These sectors are spatially contiguous and relatively homogeneous regarding their geological–geotechnical properties, indicated by a high density of points of the same group. It was possible to observe a great adherence of the proposed sectors to the mine geology, demonstrating the practical representativeness of the clustering results and the proposed sectors.
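
To make the mixed distance in this abstract concrete, the sketch below combines a squared Euclidean distance on the numerical variables with a Jaccard dissimilarity on the nominal ones. How the paper encodes the nominal variables and weights the two parts is not stated here, so the "variable=value" label sets, the weight `gamma` and the function name `mixed_dissimilarity` are assumptions for illustration only.

```python
def mixed_dissimilarity(a_num, b_num, a_nom, b_nom, gamma=1.0):
    """Dissimilarity between two observations with mixed variable types.

    a_num, b_num: numerical values in the same variable order
    a_nom, b_nom: nominal values as "variable=value" labels
    gamma:        assumed weight balancing the nominal part
    """
    # Numerical part: squared Euclidean distance.
    sq_euclid = sum((x - y) ** 2 for x, y in zip(a_num, b_num))
    # Nominal part: 1 minus the Jaccard similarity of the label sets.
    set_a, set_b = set(a_nom), set(b_nom)
    union = set_a | set_b
    jaccard_sim = len(set_a & set_b) / len(union) if union else 1.0
    return sq_euclid + gamma * (1.0 - jaccard_sim)


# Two hypothetical survey points: same lithology, different weathering grade.
d = mixed_dissimilarity(
    [1.0, 2.0], [2.0, 4.0],
    ["lithology=schist", "weathering=high"],
    ["lithology=schist", "weathering=low"],
)
```

In practice the numerical variables would normally be scaled first, otherwise the squared Euclidean term tends to dominate the bounded Jaccard term.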


15.
In this study, an artificial neural network model was developed to predict storm surges in all Korean coastal regions, with a particular focus on regional extension. The cluster neural network model (CL-NN) assessed each cluster using a cluster analysis methodology. Agglomerative clustering was used to determine the optimal clustering of 21 stations, based on a centroid-linkage method of hierarchical clustering. Finally, CL-NN was used to predict storm surges in cluster regions. In order to validate model results, sea levels predicted by the CL-NN model were compared with results using conventional harmonic analysis and the artificial neural network model in each region (NN). The values predicted by the NN and CL-NN models were closer to observed data than values predicted using harmonic analysis. Data such as root mean square error and correlation coefficient varied only slightly between CL-NN and NN model results. These findings demonstrate that cluster analysis and the CL-NN model can be used to predict regional storm surges and may be used to develop a forecast system.

16.
Worldwide analysis of the clustering of earthquakes has led to the hypothesis that the occurrence of abnormally large clusters indicates an increase in probability of a strong earthquake in the next 3–4 years within the same region. Three long-term premonitory seismicity patterns, which correspond to different non-contradictory definitions of abnormally large clusters, were tested retrospectively in 15 regions. The results of the tests suggest that about 80% of the strongest earthquakes can be predicted by monitoring these patterns. Most of the results concern pattern B (“burst of aftershocks”), i.e. an earthquake of medium magnitude with an abnormally large number of aftershocks during the first few days. Two other patterns, S and Σ, often complement pattern B and can replace it in some regions where the catalogs show very few aftershocks. The practical application of these patterns is strongly limited by the fact that neither the location of the coming earthquake within the region nor its time of occurrence within 3–4 years is indicated. However, these patterns present the possibility of increasing the reliability of medium and short-term precursors; also, they allow activation of some important early preparatory measures. The results impose the following empirical constraint on the theory of the generation of a strong earthquake: it is preceded by abnormal clustering of weaker earthquakes in the space-time-energy domain; corresponding clusters are few but may occur in a wide region around the location of the coming strong earthquake; the distances are of the same order as for the other reported precursors.

17.
As part of a research program conducted on behalf of the Department of Energy, available data on the Roosevelt Springs KGRA were synthesized to determine the spatial arrangement of the rocks, and the patterns of mass and energy flow within them. The resulting model led to a new interpretation of the geothermal system, and provided “ground truth” for evaluating the application of soil geochemistry to exploration for concealed geothermal fields. Preliminary geochemical studies comparing the surface micro-layer to conventional soil sampling methods indicated both practical and chemical advantages for the surface micro-layer technique. The elements arsenic, antimony and cesium in the surface micro-layer samples in particular gave a strong expression of one of the principal faults in the geothermal field. In contrast, the analysis of soil samples from only 20 cm below the surface gave little or no expression of the geothermal field. As a consequence, the surface micro-layer was the chosen sampling medium for the second field program, which entailed the collection of a total of some 300 samples on both a regional and detailed pattern covering about 250 km². These samples were subsequently analyzed by a variety of methods yielding data on 41 elements and ions. Computer contouring revealed that, on a single-element basis, cesium, antimony and arsenic provided the best expression of the KGRA, and indicated other interesting areas of geothermal leakage. Elements such as beryllium and lithium, which are present in highly anomalous concentrations in the opaline sinter deposited by geothermal leakage, do not have an expression in the overlying soils or the surface micro-layer. Computer manipulation of the multi-element data using R-mode factor analysis provided the optimum method of interpretation of the surface micro-layer data.
A single factor in which the principal contributors were arsenic, antimony and cesium provided the best indication of the leakage of geothermal solutions within the KGRA. Anomalies in the Escalante Desert to the west of the geothermal field are associated with the trace of a fault zone, and may, therefore, be an activity. Trend surface analysis of the soil mercury data has indicated a regional high in this element in the Mineral Mountains to the east of the KGRA, which may indicate the position of a dry heat source at depth. These data demonstrate that surface micro-layer sampling on a regional scale can serve as a prospecting tool for geothermal resource areas. However, it is possible that the optimum pathfinder elements may vary with the nature of heat source, the geochemistry of the local rocks and the local surficial environment. It is therefore recommended that a multi-element approach should be adopted, with subsequent computer processing of the data.

18.
In this study, multivariate statistical methods including factor, principal component and cluster analysis were applied to surface water quality data sets obtained from the Tahtali River Basin, Turkey. Factor and principal components analysis results revealed that surface water quality was mainly controlled by agricultural uses and domestic discharges. Cluster analysis generated two clusters. Based on the locations of the sites comprising each cluster and the variable concentrations at these stations, it was concluded that agricultural discharges strongly affected the north and northeast part of the region. These methods are believed to assist water managers in understanding the complex nature of water quality issues and in determining priorities for improving water quality.

19.
The study area is located in the southwestern part of Bangladesh. Twenty-six groundwater samples were collected from both shallow and deep tube wells ranging in depth from 20 to 60 m. Multivariate statistical analyses including factor analysis, cluster analysis and multidimensional scaling were applied to the hydrogeochemical data. The results show that a few factors adequately represent the traits that define water chemistry. The first factor of Fe and HCO3 is strongly influenced by bacterial Fe (III) reduction which would raise both Fe and HCO3 concentrations in water. Na, Cl, Ca, Mg and PO4 are grouped under the second factor representing the salinity sources of waters. The third factor, represented by As, Mn, SO4 and K is related to As mobilization processes. Cluster analysis has been applied for the interpretation of the groundwater quality data. Initially Piper methods have been employed to obtain a first idea on the water types in the study area. Hierarchical cluster analysis was carried out for further classification of water types in the study area. Twelve components, namely, pH, Fe, Mn, As, Ca, Mg, Na, K, HCO3, Cl, SO4 and NO3 have been used for this purpose. With hierarchical clustering analysis the water samples have been classified into 3 clusters. They are very high, high and moderately As-enriched groundwater as well as groundwater with elevated SO4.

20.
Whilst traditional approaches to geochemistry provide valuable insights into magmatic processes such as melting and element fractionation, by considering entire regional data sets on an objective basis using machine learning algorithms (MLAs), we can highlight new facets within the broader data structure and significantly enhance previous geochemical interpretations. The platinum-group element (PGE) budget of lavas in the North Atlantic Igneous Province (NAIP) has been shown to vary systematically according to age, geographic location and geodynamic environment. Given the large multi-element geochemical data set available for the region, MLAs were employed to explore the magmatic controls on these shifting concentrations. The key advantage of using machine learning in analysis is its ability to cluster samples across multi-dimensional (i.e., multi-element) space. The NAIP data set is manipulated using Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbour Embedding (t-SNE) techniques to increase separability in the data alongside clustering using the k-means MLA. The new multi-element classification is compared to the original geographic classification to assess the performance of both approaches. The workflow provides a means for creating an objective high-dimensional investigation on a geochemical data set and particularly enhances the identification of metallogenic anomalies across the region. The techniques used highlight three distinct multi-element end-members which successfully capture the variability of the majority of elements included as input variables. These end-members are seen to fluctuate in prominence throughout the NAIP, which we propose reflects the changing geodynamic environment and melting source. Crucially, the variability of Pt and Pd is not reflected in MLA-based clustering trends, suggesting that they vary independently through controls not readily demonstrated by the NAIP major or trace element data structure (i.e., other proxies for magmatic differentiation). This data science approach thus highlights that PGE (here signalled by Pt/Pd ratio) may be used to identify otherwise localised or cryptic geochemical inputs from the subcontinental lithospheric mantle (SCLM) during the ascent of plume-derived magma, and thereby impact upon the resulting metallogenic basket.
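
The clustering step of the workflow in this abstract can be illustrated with a bare-bones k-means (Lloyd's algorithm). This sketch leaves out the PCA and t-SNE stages and is not the authors' implementation; the function name `kmeans`, the deterministic choice of the first `k` points as starting centers and the toy two-element data are assumptions made for the example.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain k-means on a list of equal-length numeric tuples.

    Returns (labels, centers). The first k points are used as starting
    centers, an assumed deterministic choice that keeps the sketch
    reproducible; real workflows usually use randomized initialization.
    """
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical samples described by two element concentrations each.
samples = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
labels, centers = kmeans(samples, k=2)
```

With real multi-element data the input columns would typically be transformed and scaled first, and the clustering would be run on the reduced PCA or t-SNE coordinates rather than on raw concentrations.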
