Similar Articles (20 results)
1.
Increasing attention in recent years has been devoted to the application of statistical techniques in the analysis and interpretation of geologic and oceanographic data. Equally important, but less well explored, are methods for efficient experimental design. The theory of linear programming provides plans for optimal sampling of geologic and oceanographic phenomena. Of particular significance are solutions to problems of multivariate sampling. Often, a single field sample may be analyzed for a number of oxides, or a number of minerals, or a number of textural parameters. In general, these variables differ in the degree to which they are diagnostic of changes in the phenomenon of interest, and thus they must be known with different levels of precision if they are to be useful. Similarly, the variables differ in the ease with which they may be measured. If a sampling plan is to be most efficient, it must provide the requisite levels of precision for the minimum expenditure of time and effort. Sampling for a single variable may be optimized directly. Sampling for several variables simultaneously usually introduces special difficulties, but if the objective function can be generalized to hold for all variables, solutions can be determined even in this situation.
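The shared-sample tradeoff described above can be sketched numerically. In the toy below (the variances, precision targets, and one-sample-many-assays setup are illustrative assumptions, not values from the paper), one field sample yields assays for several oxides at once, so the smallest sample count is driven by the most demanding precision requirement:

```python
# Each field sample is assayed for several variables; variable i has
# (assumed) population variance sigma2[i], and the variance of its mean,
# sigma2[i] / n, must not exceed target_var[i].  The smallest shared n
# satisfying every constraint is the simplest instance of the sampling
# optimization the abstract describes.
def minimal_samples(sigma2, target_var):
    """Smallest shared n with sigma2[i] / n <= target_var[i] for all i."""
    n = 1
    while any(s / n > t for s, t in zip(sigma2, target_var)):
        n += 1
    return n

sigma2 = [4.0, 1.0, 9.0]       # assumed variances of three oxide assays
target = [0.25, 0.25, 0.09]    # assumed required variances of each mean
n = minimal_samples(sigma2, target)
```

A full linear-programming treatment would add per-variable analysis costs and minimize total cost subject to the same precision constraints; here the third variable alone forces n = 100.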

2.
The nature of the information (and its usefulness) that can be obtained from some of the more common multivariate statistical techniques is illustrated by their application to H and B horizon soils analyzed for 16 variables from the vicinity of the Key Anacon massive sulphide deposit (New Brunswick, Canada), where the geochemical response is erratic and the contrasts low. The theoretical bases of the statistical techniques are given in an appendix. Stepwise linear discriminant functions are employed as a classifying technique on H horizon soils to identify a regional anomaly, and on B horizon soils to derive discriminant scores that define the location of the sulphide zone. R-mode nonlinear mapping (RNLM) indicates that the variables Hg, Cl, and conductance are closely correlated with the organic carbon content; variations in the latter are clearly related to variations in the secondary environment. Q-mode NLM using Pb, Zn, Co, and Ni as variables (identified as good discriminators by the stepwise linear discriminant functions) identifies “outlying” anomalous samples related to the mineralized zones. A principal component biplot (using the same variables employed in the Q-mode NLM) flags essentially the same samples as anomalous as the Q-mode NLM does. The biplot technique has the added advantage of also indicating which variable(s) is responsible for a particular sample being identified as anomalous. The practical result of the investigation is that the zone of sulphide mineralization defined by drilling and from underground data is confidently identified and delineated. A number of geophysical magnetic anomalies over the same stratigraphic horizon have a weak geochemical response or none at all, although a few locations are defined as second-priority targets for follow-up work.

3.
This paper addresses the problem of quantifying the joint uncertainty in the grades of elements of interest (iron, silica, manganese, phosphorus and alumina), loss on ignition, granulometry and rock types in an iron ore deposit. Sampling information is available from a set of exploration drill holes. The methodology considers the construction of multiple rock type outcomes by plurigaussian simulation; outcomes of the quantitative variables (grades, loss on ignition and granulometry) are then constructed by multigaussian joint simulation, accounting for geological domains specific to each quantitative variable as well as for a stoichiometric closure formula linking these variables. The outcomes are validated by checking the reproduction of the data distributions and of the data values at the drill hole locations, and their ability to measure the uncertainty at unsampled locations is assessed by leave-one-out cross validation. Both the plurigaussian and multigaussian models offer much flexibility to the practitioner in facing up to the complexity of the variables being modeled, in particular: (1) the contact relationships between rock types, (2) the geological controls exerted by the rock types over the quantitative variables, and (3) the cross-correlations and stoichiometric closure linking the quantitative variables. In addition to this flexibility, the use of efficient simulation algorithms turns out to be essential for a successful application, due to the high number of variables, data and locations targeted for simulation.
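The core idea behind plurigaussian simulation, truncating a continuous Gaussian field into categorical rock types, can be sketched in one dimension. Everything below (the AR(1)-style correlation, the thresholds, and the rock-type labels) is an illustrative assumption; the real method uses variogram-fitted Gaussian fields and explicit contact rules:

```python
# Minimal sketch of the truncation idea behind plurigaussian simulation:
# simulate a correlated Gaussian sequence, then map value ranges to rock
# types via fixed thresholds.
import random

random.seed(0)

def gaussian_field(n, rho):
    """1-D stationary Gaussian sequence with lag-1 correlation rho."""
    z = [random.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        z.append(rho * z[-1] + (1 - rho ** 2) ** 0.5 * random.gauss(0.0, 1.0))
    return z

def truncate(z, thresholds=(-0.5, 0.8),
             labels=("shale", "itabirite", "hematite")):
    """Map the continuous field to rock types via truncation thresholds."""
    return [labels[sum(v > t for t in thresholds)] for v in z]

field = gaussian_field(200, rho=0.9)
rock = truncate(field)
```

Because the underlying field is spatially correlated, the resulting rock types form contiguous runs rather than independent draws, which is what gives truncated-Gaussian methods their geological plausibility.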

4.
River pollution data are characterized by high variability. Multivariate statistical methods help to untangle such complex multidimensional datasets and to extract latent information (e.g. differently polluted areas, discharges). Chemometric methods can handle interactions between different pollutants and relationships among the various sampling locations. This study presents an application of multivariate data analysis in the field of environmental pollution. The dataset consists of the As, Cd, Cr, Cu, Fe, Ni, Mn, Pb and Zn contents of sediment samples collected in the upper and middle Odra River (Poland) in three sampling campaigns (November 1998, June 1999, and May 2000). Cluster analysis (CA), multivariate analysis of variance with discriminant analysis (MVDA) and factor analysis (FA) were used as chemometric tools to investigate the matrix of 60 sampling points.

5.
 A new method of standardizing metal concentrations in sediments was tested on samples from Lake Miccosukee, a large karstic lake in north Florida. Metal concentrations were analyzed in 222 sediment samples from 26 cores representing 9 sampling sites in the lake. Measured sedimentation rates in the lake are low. Percent organic matter strongly increases upward in all the cores. The C/N ratio remains constant throughout all the samples, with a mean value of about 13, regardless of depth or location. All of the geochemical variables are at least approximately log-normally distributed; thus, log-log or semi-log scattergrams were used and the data were log-transformed before statistical calculations were performed. Some elements (Mn, Zn, Hg, Cu, and Ca) are primarily associated with the organic fraction; others (La, Cr, Sr, and Ba) are clearly related to the terrigenous fraction; others show affinities for both fractions. Consequently, no bivariate scattergrams or plots of ratio versus depth – commonly used for standardization by plotting or ratioing a reference element (such as Al) to an element of interest – were found to be adequate for standardization of this dataset. The best method for standardization was found to be one based on multivariate (trivariate) linear regression, using log Al and log C as the independent variables (reference elements representing terrigenous and organic fractions, respectively), and the log of the element of interest as the dependent variable. Residuals (deviations) from the best-fit linear surface were then plotted versus depth in the cores to accomplish the standardization. The results indicate that, with the possible exception of Mn at two sites, there is little evidence of anthropogenic input of trace elements to the lake, and most trace-element concentrations in the lake can be considered as valuable baseline information. 
A significant finding is that different and erroneous conclusions might have been reached if other standardization methods, not based on trivariate regression, had been employed. Received: 28 August 1997 · Accepted: 24 November 1997
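The standardization step the abstract describes, regressing the log of the element of interest on log Al and log C and keeping the residuals, can be sketched on synthetic data. The data-generating coefficients below are assumptions for illustration; the solver is plain least squares via normal equations for the three-parameter fit:

```python
# Trivariate-regression standardization sketch: fit
# log(metal) ~ b0 + b1*log(Al) + b2*log(C), then use residuals from the
# best-fit plane as the standardized values.
import random

def fit_plane(x1, x2, y):
    """Least-squares fit y ~ b0 + b1*x1 + b2*x2 via normal equations."""
    n = len(y)
    cols = [[1.0] * n, x1, x2]                       # design columns [1, x1, x2]
    A = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(c * v for c, v in zip(ci, y)) for ci in cols]
    for i in range(3):                                # Gaussian elimination
        p = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        rhs[i], rhs[p] = rhs[p], rhs[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            A[r] = [a - f * b for a, b in zip(A[r], A[i])]
            rhs[r] -= f * rhs[i]
    b = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):                               # back substitution
        b[i] = (rhs[i] - sum(A[i][j] * b[j] for j in range(i + 1, 3))) / A[i][i]
    return b

random.seed(1)
log_al = [random.uniform(0.0, 1.0) for _ in range(50)]
log_c = [random.uniform(0.0, 1.0) for _ in range(50)]
log_zn = [0.3 + 0.8 * a + 0.5 * c + random.gauss(0.0, 0.05)   # assumed model
          for a, c in zip(log_al, log_c)]
b0, b1, b2 = fit_plane(log_al, log_c, log_zn)
residuals = [y - (b0 + b1 * a + b2 * c)
             for y, a, c in zip(log_zn, log_al, log_c)]
```

Plotting these residuals versus core depth, as in the paper, then separates anthropogenic excess from variation explained by the terrigenous and organic reference fractions.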

6.
A geotechnical problem that involves several spatially correlated parameters can be best described using multivariate cross-correlated random fields. The joint distribution of these random variables cannot be uniquely defined using their marginal distributions and correlation coefficients alone. This paper presents a generic methodology for generating multivariate cross-correlated random fields. The joint distribution is rigorously established using a copula function that describes the dependence structure among the individual variables. The cross-correlated random fields are generated through Cholesky decomposition and conditional sampling based on the joint distribution. The random fields are verified regarding the anisotropic scales of fluctuation and copula parameters.
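The Cholesky step mentioned above has a compact illustration: if L is the Cholesky factor of a target correlation matrix, then L applied to independent standard normals yields variables with that correlation. The 2x2 matrix and sample size below are illustrative assumptions, not values from the paper:

```python
# Cholesky sketch for cross-correlated Gaussian variables: draw
# independent normals e, output L @ e, and check the empirical correlation.
import random

def cholesky(C):
    """Lower-triangular L with L L^T = C (C symmetric positive definite)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = (C[i][i] - s) ** 0.5
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L

C = [[1.0, 0.6], [0.6, 1.0]]   # assumed cross-correlation of two soil parameters
L = cholesky(C)
random.seed(2)
pairs = []
for _ in range(20000):
    e = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    pairs.append((L[0][0] * e[0], L[1][0] * e[0] + L[1][1] * e[1]))
r = sum(a * b for a, b in pairs) / len(pairs)   # empirical correlation, ~0.6
```

A full multivariate random field adds spatial correlation along the field as well; a non-Gaussian dependence structure would replace the implicit Gaussian copula here with the fitted copula of the paper.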

7.
Stability is a key issue in any mining or tunnelling activity. Joint frequency constitutes an important input into stability analyses. Three techniques are used herein to quantify the local and spatial joint frequency uncertainty, i.e. the possible joint frequencies at unsampled locations given joint frequency data. Rock quality designation is estimated from the predicted joint frequencies. The first method is based on kriging with subsequent Poisson sampling. The second method transforms the data to near-Gaussian variables and uses the turning bands method to generate a range of possible joint frequencies. The third method assumes that the data are Poisson distributed and models the log-intensity of these data with a spatially smooth Gaussian prior distribution. Intensities are obtained and Poisson variables are generated to examine the expected joint frequency and associated variability. The joint frequency data are from an iron ore mine in northern Norway. The methods are tested at unsampled locations and validated at sampled locations. All three methods perform quite well when predicting sampled points. The probability that the joint frequency exceeds 5 joints per metre is also estimated to illustrate a more realistic utilisation. The resulting probability map highlights zones in the ore where stability problems have occurred. It is therefore concluded that the methods work and that more emphasis should have been placed on these kinds of analyses when the mine was planned. By using simulation instead of estimation, it is possible to obtain a clear picture of possible joint frequency values or ranges, i.e. the uncertainty.
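The "kriging with subsequent Poisson sampling" step can be illustrated in isolation: given a (here simply assumed) kriged joint intensity at one block, simulate Poisson counts and estimate the exceedance probability the abstract mentions. The intensity value and sample size are illustrative assumptions:

```python
# Poisson-sampling sketch: simulate joint counts per metre at an assumed
# kriged intensity and estimate P(joint frequency > 5 joints per metre).
import math
import random

def poisson(lam, rng):
    """Knuth's multiplicative method for a Poisson draw with mean lam."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(3)
lam = 4.2                                     # assumed kriged intensity at one block
draws = [poisson(lam, rng) for _ in range(20000)]
p_exceed = sum(d > 5 for d in draws) / len(draws)   # ~0.25 for lam = 4.2
```

Repeating this at every block of the kriged intensity map yields the kind of exceedance-probability map the paper uses to highlight potential stability problems.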

8.
Owing to the cost of site investigation and the limits of test sites, the available test data are usually scarce, and it is difficult to accurately estimate the statistical characteristics of geotechnical parameters and slope reliability from such limited data. Bayesian methods can fuse the limited site information to reduce the estimated uncertainty of geotechnical parameters and thereby improve the assessed slope reliability. However, most current Bayesian updating studies assume normal, lognormal or uniform prior distributions for the parameters and a multivariate normal likelihood function, and the reasonableness of this practice requires further verification. This paper summarizes the parameter prior distributions and likelihood-function models commonly used in geotechnical Bayesian analysis and, taking an undrained clay slope as an example, uses an adaptive Bayesian updating method to systematically investigate the influence of the prior distribution and the likelihood function on the inferred posterior distributions of spatially variable slope parameters and on reliability updating. The results show that the prior distribution has some influence on both the posterior inference and the reliability update; posteriors inferred with lognormal and extreme-value type I priors show less scatter. Slope reliability results obtained with Beta and extreme-value type I priors are conservative and unconservative, respectively, while those obtained with a lognormal prior lie in between. By comparison, the influence of the likelihood function is more pronounced: compared with other likelihood types, a likelihood constructed from a multivariate joint normal distribution reduces the estimated uncertainty of the geotechnical parameters while yielding results that agree more closely with the site information. In addition, the autocorrelation between measurement errors at different locations, when constructing the likelihood function, also has some influence on the posterior failure probability of the slope.
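As a toy stand-in for the updating idea (not the adaptive method or the spatially variable model of the paper), the sketch below combines a lognormal prior for one soil parameter with a normal measurement likelihood on a grid; every number is an assumption for illustration:

```python
# Grid-based Bayesian update for a single undrained shear strength
# parameter su (kPa): posterior ∝ lognormal prior × normal likelihood.
import math

grid = [20 + 0.5 * i for i in range(81)]      # candidate su values, 20-60 kPa
mu_ln, sd_ln = math.log(35.0), 0.25           # assumed lognormal prior
obs, sd_obs = [32.0, 38.0, 34.0], 4.0         # assumed site measurements

def prior(x):
    # Unnormalized lognormal density (constant factors cancel on normalization).
    return math.exp(-(math.log(x) - mu_ln) ** 2 / (2 * sd_ln ** 2)) / (x * sd_ln)

def likelihood(x):
    # Unnormalized product of independent normal measurement densities.
    return math.exp(-sum((o - x) ** 2 for o in obs) / (2 * sd_obs ** 2))

post = [prior(x) * likelihood(x) for x in grid]
z = sum(post)
post = [p / z for p in post]
post_mean = sum(x * p for x, p in zip(grid, post))
```

The paper's point is precisely that the choices frozen here (lognormal prior, independent normal likelihood) materially affect the posterior and the updated failure probability, so they deserve scrutiny rather than default adoption.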

9.
A system of interactive graphic computer programs for multivariate statistical analysis of geoscience data (SIMSAG) has been developed to facilitate the construction of statistical models for evaluating potential mineral and energy resources from geoscience data. The system provides an integrated interactive package for graphic display, data management, and multivariate statistical analysis. It is specifically designed to analyze and display spatially distributed information that includes the geographic locations of observations. SIMSAG enables users not only to perform several different types of multivariate statistical analysis but also to display the selected data or the results of analyses in map form. In the analysis of spatial data, graphic displays are particularly useful for interpretation, because the results can easily be compared with known spatial characteristics of the data. The system also permits the user to modify variables and to select subareas delineated with a cursor. All operations and commands are performed interactively via a graphic computer terminal. A case study is presented as an example: the construction of a statistical model for evaluating potential areas for uranium exploration from geological, geophysical, geochemical, and mineral occurrence map data quantified for equal-area cells in the Kasmere Lake area in Manitoba, Canada.

10.
Infill-sampling design and the cost of classification errors
The criterion used to select infill sample locations should depend on the sampling objective. Minimizing the global estimation variance is the most widely used criterion and is suitable for many problems. However, when the objective of the sampling program is to partition an area of interest into zones of high values and zones of low values, minimizing the expected cost of classification errors is a more appropriate criterion. Unlike the global estimation variance, the cost of classification errors incorporates both the sample locations and the sample values into an objective infill-sampling design criterion.

11.
Environmental geochemistry has attracted increasing interest during the last decade. In Sweden, geochemical mapping is carried out with methods that allow the data to be used in environmental research, including sampling plant roots and mosses from streams, soils and bedrock. These three sample types form an integrated strategy in environmental research, as well as in geochemical exploration. However, one problem that becomes prominent in geochemical mapping is distinguishing the signals derived from natural sources from those derived from anthropogenic sources. So far, this has mostly been done by using different types of samples, for example, different soil horizons. This is both expensive and time-consuming. We are currently developing alternative statistical solutions to this problem. The method used here is PLSR (partial least squares regression analysis). In this paper, we present an initial discussion of the applicability of PLSR in differentiating anthropogenic anomalies from natural contents. PLSR performs a simultaneous, interdependent principal component analysis decomposition in both the X- and Y-matrices, in such a way that the information in the Y-matrix is used directly as a guide for optimal decomposition of the X-matrix. PLSR thus performs a generalized multivariate regression of Y on X, overcoming the multicollinearity problem of correlated X-variables. The advantage of PLSR is that it gives optimal prediction ability in a strict statistical sense. Bedrock geochemistry from different lithologies in the mapping area in southern Sweden (Y-matrix) is analyzed together with stream or soil data (X-matrix). By modelling the PLS regression between these two data sets, separate multivariate geochemical models based on the different bedrock types were developed. This step is called the training or modelling stage of the multivariate calibration. These calibrated models are subsequently used for predicting new (X) geochemical samples and estimating the corresponding Y-variable values. Information is obtained on how much of the metal content in each new geochemical sample correlates with the different modelled bedrock types. By computing the appropriate X-residuals, we obtain information on the anthropogenic impact that is also carried by these new samples. In this way, it is possible from one single geochemical survey to derive both conventional geochemical background data and anthropogenic data, both of which can be readily displayed as maps. The present study concerns the development of data analysis methods. Examples of the application of the methodology are presented using Pb and U. The results show the share of these contents in different sampling media that is derived from bedrock on the one hand, and from anthropogenic sources on the other.
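A one-component PLS1 fit, the simplest case of the PLSR machinery described above, is short enough to sketch on synthetic data with two deliberately collinear predictors (a hypothetical hidden "bedrock factor"); all numbers are illustrative assumptions:

```python
# One-component PLS1 sketch on centred data: weight w ∝ X^T y,
# scores t = X w, then regress y on t.  Collinear columns, which defeat
# ordinary least squares, are handled naturally.
import random

random.seed(7)
n = 200
h = [random.gauss(0.0, 1.0) for _ in range(n)]               # hidden factor
X = [[hi + random.gauss(0.0, 0.1), 2.0 * hi + random.gauss(0.0, 0.1)]
     for hi in h]                                            # two collinear predictors
y = [3.0 * hi + random.gauss(0.0, 0.2) for hi in h]

def center(cols):
    means = [sum(c) / len(c) for c in cols]
    return [[v - m for v in c] for c, m in zip(cols, means)]

cols = center([[row[j] for row in X] for j in range(2)])
yc = center([y])[0]

w = [sum(c[i] * yc[i] for i in range(n)) for c in cols]      # X^T y direction
norm = sum(v * v for v in w) ** 0.5
w = [v / norm for v in w]
t = [sum(cols[j][i] * w[j] for j in range(2)) for i in range(n)]  # scores
tt = sum(v * v for v in t)
b = sum(t[i] * yc[i] for i in range(n)) / tt                 # regress y on t
pred = [b * ti for ti in t]
ss_res = sum((a - p) ** 2 for a, p in zip(yc, pred))
r2 = 1.0 - ss_res / sum(v * v for v in yc)
```

In the paper's setting, the X-residuals left over after such a fit (the part of a sample not explained by the bedrock models) carry the anthropogenic signal.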

12.
In reliability analysis, the crude Monte Carlo method is known to be computationally demanding. To improve computational efficiency, this paper presents an importance sampling based algorithm that can be applied to conduct efficient reliability evaluation for axially loaded piles. The spatial variability of soil properties along the pile length is considered by random field modeling, in which a mean, a variance, and a correlation length are used to statistically characterize a random field. The local averaging subdivision technique is employed to generate random fields. In each realization, the random fields are used as inputs to the well-established load transfer method to evaluate the load–displacement behavior of an axially loaded pile. Failure is defined as the event where the vertical movement at the pile top exceeds the allowable displacement. By sampling more heavily from the region of interest and then scaling the indicator function back by a ratio of probability densities, a faster rate of convergence can be achieved in the proposed importance sampling algorithm while maintaining the same accuracy as in the crude Monte Carlo method. Two examples are given to demonstrate the accuracy and the efficiency of the proposed method. It is shown that the estimate based on the proposed importance sampling method is unbiased. Furthermore, the size of samples can be greatly reduced in the developed method.
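The "sample more heavily from the region of interest, then scale back by a ratio of densities" idea can be demonstrated on a one-dimensional stand-in for the pile problem: estimating the small failure probability P(Z > 3) for a standard normal Z by sampling from a density shifted into the failure region. The 1-D limit state and shift are assumptions for illustration, not the pile model of the paper:

```python
# Importance-sampling sketch: sample X ~ N(mu, 1) instead of N(0, 1) and
# reweight each failure indicator by phi(x) / phi(x - mu)
# = exp(-mu*x + mu^2/2), the ratio of target to sampling densities.
import math
import random

random.seed(4)
z0, mu, n = 3.0, 3.0, 20000          # threshold, density shift, sample size
est = 0.0
for _ in range(n):
    x = random.gauss(mu, 1.0)        # draw from the shifted density
    if x > z0:
        est += math.exp(-mu * x + mu * mu / 2.0)
p_is = est / n

exact = 0.5 * math.erfc(z0 / math.sqrt(2.0))   # 1 - Phi(3) ≈ 1.35e-3
```

Crude Monte Carlo with the same n would see only about 27 failures on average; the shifted sampler sees a failure on roughly half its draws, which is where the variance reduction comes from.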

13.
Change detection in remote sensing images based on canonical correlation analysis
Because of the correlations among channels, change detection in multichannel remote sensing imagery is more difficult than in single-channel imagery: the change information spread across the individual channels must be concentrated effectively in order to construct a difference image between the two acquisition dates that supports the analysis and interpretation of the changes. To address the difficulty of concentrating multichannel change information and the difficulty of removing the effect of inter-channel correlation, canonical correlation analysis from multivariate statistics is introduced: the multichannel images from the two dates are treated as two sets of multivariate random variables, and the multivariate alteration detection transformation is applied to recombine all of the difference (change) information across the spectral channels into a set of mutually uncorrelated output variables. This removes, to the greatest extent possible, the adverse effect of inter-channel correlation on change detection and provides an initial solution to the problem of constructing the difference image.

14.
Two optimization techniques to predict a spatial variable from any number of related spatial variables are presented. The applicability of the two methods for petroleum-resource assessment is tested in a mature oil province of the Midcontinent (USA). Information on petroleum productivity, usually not directly accessible, is related indirectly to geological, geophysical, petrographical, and other observable data. This paper presents two approaches based on constructing a multivariate spatial model from the available data to determine a relationship for prediction. In the first approach, the variables are combined into a spatial model by an algebraic map-comparison/integration technique. Optimal weights for the map-comparison function are determined by the Nelder-Mead downhill simplex algorithm in multiple dimensions. Geologic knowledge is necessary to provide a first guess of the weights to start the automation, because the solution is not unique. In the second approach, active-set optimization for linear prediction of the target under positivity constraints is applied. Here, the procedure seems to select one variable from each data type (structural, isopachous, and petrophysical), eliminating data redundancy. Automating the determination of optimal combinations of different variables by applying optimization techniques is a valuable extension of the algebraic map-comparison/integration approach to analyzing spatial data. Because of their capability of handling multivariate data sets and partial retention of geographical information, the approaches can be useful in mineral-resource exploration.

15.
Uncertainty quantification for subsurface flow problems is typically accomplished through model-based inversion procedures in which multiple posterior (history-matched) geological models are generated and used for flow predictions. These procedures can be demanding computationally, however, and it is not always straightforward to maintain geological realism in the resulting history-matched models. In some applications, it is the flow predictions themselves (and the uncertainty associated with these predictions), rather than the posterior geological models, that are of primary interest. This is the motivation for the data-space inversion (DSI) procedure developed in this paper. In the DSI approach, an ensemble of prior model realizations, honoring prior geostatistical information and hard data at wells, are generated and then (flow) simulated. The resulting production data are assembled into data vectors that represent prior ‘realizations’ in the data space. Pattern-based mapping operations and principal component analysis are applied to transform non-Gaussian data variables into lower-dimensional variables that are closer to multivariate Gaussian. The data-space inversion is posed within a Bayesian framework, and a data-space randomized maximum likelihood method is introduced to sample the conditional distribution of data variables given observed data. Extensive numerical results are presented for two example cases involving oil–water flow in a bimodal channelized system and oil–water–gas flow in a Gaussian permeability system. For both cases, DSI results for uncertainty quantification (e.g., P10, P50, P90 posterior predictions) are compared with those obtained from a strict rejection sampling (RS) procedure. Close agreement between the DSI and RS results is consistently achieved, even when the (synthetic) true data to be matched fall near the edge of the prior distribution. 
Computational savings using DSI are very substantial in that RS requires on the order of \(10^5\) to \(10^6\) flow simulations, in contrast to 500 for DSI, for the cases considered.

16.
Engineering Geology, 2007, 89(1-2): 47-66
This work describes the application of logistic regression (LR) to an assessment of susceptibility to mass movements in an 850 km² study area mainly on the Ionian side of the Aspromonte Range, in southern Calabria. LR is a multivariate function that can be used, on the basis of a given set of variables, to calculate the probability that a particular phenomenon (for instance, a landslide) is present. In the present study the set of relevant variables includes rock type, land use, elevation, slope angle, aspect, and slope profile curvature down-slope and across-slope. The aim of this paper is to evaluate the LR performance when the procedure is based on the surveying of mass movements in part of the study area. The procedure adopted was GIS-based, with a 10 m DEM square grid; for slope and curvature calculation, four adjacent cells were grouped to form a nine-point set for mathematical processing. The LR application consists of four steps: sampling, where all relevant characteristics in a part of the area (ca. 27% of the study zone) are assessed; variable parameterisation, where non-parametric variables are transformed into parametric (or semi-parametric) variables (on at least a rank scale); model fitting, where regression coefficients are iteratively calculated in the sample area; and model application, where the best-fit regression function is applied to the entire study area. This procedure was applied in two ways: first considering all types of mass movement, then a single type. The ground characteristics of the whole study zone were determined. The LR procedure was first tested by extending the sampling and reclassification steps to the whole study zone to find the best possible fitting regression; the results were then compared with ground truth to maximise performance. Afterwards, the results of LR analysis, based on extending regression formulas obtained also using 40% sampling zones, were compared with those of the best possible regression and with ground truth. Comparisons were performed by means of a confusion matrix and a simple correlation between expected and observed values for grouped variables. The overall results seem promising: for example, if the 27% sample areas are adopted, 94% of the cells where the probability of the existence of any kind of mass movement is between 85.5% and 95% are actually affected by mass movements. Results are less good when attempting to distinguish between types of mass movement.
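The model-fitting step above, iteratively estimating logistic-regression coefficients, can be sketched with plain gradient descent on synthetic "landslide presence" data. The two predictors, the data-generating coefficients, and the learning rate are all assumptions for illustration:

```python
# Logistic-regression sketch: P(landslide) = sigmoid(w . x + b), fitted by
# batch gradient descent on synthetic (slope angle, curvature) data.
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(5)
X, y = [], []
for _ in range(400):
    slope = random.uniform(0.0, 45.0)          # degrees
    curv = random.uniform(-1.0, 1.0)
    p_true = sigmoid(0.15 * slope - 1.0 * curv - 3.0)   # assumed ground truth
    X.append((slope, curv))
    y.append(1 if random.random() < p_true else 0)

w, b, lr = [0.0, 0.0], 0.0, 0.001
for _ in range(500):                           # batch gradient descent
    gw, gb = [0.0, 0.0], 0.0
    for (s, c), t in zip(X, y):
        e = sigmoid(w[0] * s + w[1] * c + b) - t
        gw[0] += e * s
        gw[1] += e * c
        gb += e
    w = [w[0] - lr * gw[0] / len(X), w[1] - lr * gw[1] / len(X)]
    b -= lr * gb / len(X)

p_steep = sigmoid(w[0] * 40.0 + w[1] * -0.5 + b)   # steep, concave cell
p_flat = sigmoid(w[0] * 5.0 + w[1] * 0.5 + b)      # gentle, convex cell
```

Applying the fitted function to every grid cell, as the paper does over the whole study zone, turns these per-cell probabilities into a susceptibility map.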

17.
Some concepts that are widely accepted in univariate hydrological statistics have not been analyzed in depth in the multivariate setting and are easily misunderstood, for example the relationship between N/T and the number of multivariate events with return period greater than or equal to T occurring within N years. In practice, some empirical relationships between multivariate joint return periods and the return periods of the marginal variables have been observed and verified with case studies, but without explanation or derivation. Based on the Gumbel-Hougaard (GH) copula, this paper derives the relationship between the bivariate joint return period and the marginal return periods, as well as the quantitative relationship between the number of bivariate events, their return period, and the degree of correlation between the variables. Drought events were identified from 56 years of monthly SPI (Standardized Precipitation Index) and SRI (Standardized Runoff Index) series for Kunming; a joint distribution of drought duration and severity was constructed with the GH copula; and the derived relationships between the bivariate joint return period and the marginal return periods, and between the number of multivariate events and their return period, were verified. The results show that it is inappropriate to justify a return-period analysis of drought characteristics by checking whether the "and" first return period is close to the mean time interval between droughts more severe than the given event. When the correlation between the variables is weak, the larger of the marginal return periods should be used only with caution as an approximation to the "and" first return period.
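The joint return periods discussed above have closed forms once the copula is fixed. The sketch below evaluates the "and" and "or" return periods under a Gumbel-Hougaard copula; the dependence parameter, mean interarrival time, and marginal probabilities are assumed values, not the fitted Kunming results:

```python
# Gumbel-Hougaard copula C(u,v) = exp(-[(-ln u)^theta + (-ln v)^theta]^(1/theta)).
# "Or" return period: T_or = mu / (1 - C(u, v)).
# "And" return period: T_and = mu / (1 - u - v + C(u, v)).
import math

def gh_copula(u, v, theta):
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))

mu = 1.2                  # assumed mean drought interarrival time, years
theta = 2.5               # assumed GH dependence parameter (theta >= 1)
u = v = 1.0 - 1.0 / 10.0  # duration and severity each at their 10-event level

c = gh_copula(u, v, theta)
t_and = mu / (1.0 - u - v + c)   # both thresholds exceeded
t_or = mu / (1.0 - c)            # at least one threshold exceeded
```

Since exceeding both thresholds is rarer than exceeding either one, T_and is always at least T_or, and the gap between them widens as the dependence parameter theta decreases toward independence, which is the caution the abstract raises.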

18.
Soil pollution studies typically collect multivariate measurements at sampling locations, e.g., lead, zinc, copper or cadmium levels. With the increased collection of such multivariate geostatistical spatial data there arises the need for flexible explanatory stochastic models. Here, we propose a general constructive approach for building suitable models based upon convolution of covariance functions. We begin with a general theorem which asserts that, under weak conditions, cross convolution of covariance functions provides a valid cross covariance function. We also obtain a result on the dependence induced by such convolution. Since, in general, the convolution does not admit closed-form integration, we discuss efficient computation. We then suggest introducing such a specification through a Gaussian process to model multivariate spatial random effects within a hierarchical model. We note that modeling spatial random effects in this way is parsimonious relative to, say, the linear model of coregionalization. Through a limited simulation, we informally demonstrate that the performance of these two specifications appears to be indistinguishable, encouraging the parsimonious choice. Finally, we use the convolved covariance model to analyze a trivariate pollution dataset from California.

19.
Cokriging allows data on correlated variables to be used to enhance the estimation of a primary variable or, more generally, to enhance the estimation of all variables. In the first case, known as the undersampled case, it allows data on an auxiliary variable to make up for an insufficient amount of data on the primary variable. Original formulations required that there be sufficiently many locations where data are available for both variables. The pseudo-cross-variogram, introduced by Clark et al. (1989), allows a related empirical spatial function to be computed and modeled, which can then be used in the cokriging equations in lieu of the cross-variogram. A number of questions left unanswered by Clark et al. are resolved here, such as the availability of valid models, an appropriate definition of positive-definiteness, and the relationship of the pseudo-cross-variogram to the usual cross-variogram. The latter relationship is important for modeling this function.
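The empirical pseudo-cross-variogram is straightforward to compute: gamma_12(h) = 0.5 * E[(Z1(x) - Z2(x + h))^2], which, unlike the cross-variogram, needs no collocated pairs. The 1-D transect and data-generating model below are illustrative assumptions:

```python
# Empirical pseudo-cross-variogram sketch on a synthetic 1-D transect:
# two correlated variables sharing a latent AR(1) process, with different
# means, so gamma_12 does not vanish even at small lags.
import random

random.seed(6)
n = 500
z1, z2, base = [], [], 0.0
for _ in range(n):
    base = 0.9 * base + random.gauss(0.0, 1.0)      # shared latent structure
    z1.append(base + random.gauss(0.0, 0.2))
    z2.append(0.8 * base + 1.5 + random.gauss(0.0, 0.2))   # shifted mean

def pseudo_cross_variogram(a, b, h):
    """0.5 * mean of (a(x) - b(x + h))^2 over all lag-h pairs."""
    pairs = [(a[i], b[i + h]) for i in range(len(a) - h)]
    return 0.5 * sum((x - y) ** 2 for x, y in pairs) / len(pairs)

gammas = [pseudo_cross_variogram(z1, z2, h) for h in (1, 5, 20)]
```

The function rises with lag as the shared correlation decays, and its nonzero sill offset (from the mean difference between Z1 and Z2) is one of the ways it differs from the usual cross-variogram, a point the paper's comparison addresses.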

20.
Guidelines are determined for the spatial density and location of climatic variables (temperature and precipitation) that are appropriate for estimating the continental- to hemispheric-scale pattern of atmospheric circulation (sea-level pressure). Because instrumental records of temperature and precipitation simulate the climatic information that is contained in certain paleoenvironmental records (tree-ring, pollen, and written-documentary records, for example), these guidelines provide useful sampling strategies for reconstructing the pattern of atmospheric circulation from paleoenvironmental records. The statistical analysis uses a multiple linear regression model. The sampling strategies consist of changes in site density (from 0.5 to 2.5 sites per million square kilometers) and site location (from western North American sites only to sites in Japan, North America, and western Europe) of the climatic data. The results showed that the accuracy of specification of the pattern of sea-level pressure: (1) is improved if sites with climatic records are spread as uniformly as possible over the area of interest; (2) increases with increasing site density, at least up to the maximum site density used in this study; and (3) is improved if sites cover an area that extends considerably beyond the limits of the area of interest. The accuracy of specification was lower for independent data than for the data that were used to develop the regression model; some skill was found for almost all sampling strategies.
