An objective replacement method for censored geochemical data |
| |
Authors: | Richard F. Sanford, Charles T. Pierson, Robert A. Crovelli |
| |
Affiliation: | U.S. Geological Survey, Denver Federal Center, M.S. 905, 80225 Denver, Colorado |
| |
Abstract: | Geochemical data are commonly censored; that is, concentrations for some samples are reported as less than or greater than some value. Censored data hamper statistical analysis because certain computational techniques require a complete set of uncensored data. We show that the simple substitution method for creating an uncensored data set, e.g., replacement by 3/4 times the detection limit, has serious flaws, and we present an objective method to determine the replacement value. Our basic premise is that the replacement value should equal the mean of the actual values represented by the qualified data. We adapt the maximum likelihood approach (Cohen, 1961) to estimate this mean. This method reproduces the mean and skewness as well as or better than a simple substitution method using 3/4 of the lower detection limit or 4/3 of the upper detection limit. For a small proportion of less-than substitutions, a simple-substitution replacement factor of 0.55 is preferable to 3/4; for a small proportion of greater-than substitutions, a simple-substitution replacement factor of 1.7 is preferable to 4/3, provided the resulting replacement value does not exceed 100%. For more than 10% replacement, a mean empirical factor may be used. However, empirically determined simple-substitution replacement factors usually vary among data sets and become less reliable as the proportion of replacements grows. Therefore, a maximum likelihood method is superior in general. Theoretical and empirical analyses show that true replacement factors for less-thans decrease in magnitude with more replacements and larger standard deviation; those for greater-thans increase in magnitude with more replacements and larger standard deviation. In contrast to any simple substitution method, the maximum likelihood method reproduces these variations.
Using the maximum likelihood method to replace less-thans in our sample data set, we estimated correlation coefficients with reasonable accuracy in 90% of the cases for as much as 40% replacement and in 60% of the cases for 80% replacement. These results suggest that censored data can be utilized more than is commonly realized. |
| |
Keywords: | substitution method; qualified data; lognormal distribution; environmental science |
This article is indexed in SpringerLink and other databases. |
|
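The abstract's premise, that the replacement value should equal the mean of the actual values hidden behind the qualified data, can be illustrated with a short Python sketch. It is not the authors' procedure: it assumes an exactly lognormal population whose log-scale standard deviation `sigma` is already known, whereas the paper estimates the distribution from the censored data by maximum likelihood (Cohen, 1961). The function names and parameters are hypothetical. Under that assumption the replacement factor (mean of the censored values divided by the detection limit) has a closed form that does not depend on the log-scale mean.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def less_than_factor(censored_fraction: float, sigma: float) -> float:
    """Illustrative factor E[X | X < DL] / DL for a lognormal X.

    censored_fraction: share of samples below the lower detection limit (0..1).
    sigma: standard deviation of ln(X); assumed known here (an assumption).
    """
    z = _STD_NORMAL.inv_cdf(censored_fraction)  # standardized ln(DL)
    # E[X | X < DL] = exp(mu + sigma^2/2) * Phi(z - sigma) / Phi(z)
    # DL            = exp(mu + sigma * z), so mu cancels in the ratio.
    return (math.exp(0.5 * sigma**2 - sigma * z)
            * _STD_NORMAL.cdf(z - sigma) / censored_fraction)


def greater_than_factor(censored_fraction: float, sigma: float) -> float:
    """Illustrative factor E[X | X > DL] / DL for a lognormal X.

    censored_fraction: share of samples above the upper detection limit (0..1).
    """
    z = _STD_NORMAL.inv_cdf(1.0 - censored_fraction)  # standardized ln(DL)
    upper_tail = 1.0 - _STD_NORMAL.cdf(z - sigma)
    return math.exp(0.5 * sigma**2 - sigma * z) * upper_tail / censored_fraction


if __name__ == "__main__":
    # Tabulate how the factors move with the censored share and with sigma.
    for sigma in (0.5, 1.0, 2.0):
        for share in (0.05, 0.10, 0.40):
            lt = less_than_factor(share, sigma)
            gt = greater_than_factor(share, sigma)
            print(f"sigma={sigma:.1f} share={share:.2f} "
                  f"less-than factor={lt:.3f} greater-than factor={gt:.3f}")
```

Under these idealized assumptions the less-than factor falls, and the greater-than factor rises, as either the censored share or `sigma` increases, which is the direction of variation the abstract reports. The numerical values should not be expected to match the empirical factors quoted in the abstract (0.55 and 1.7), which come from the authors' own data.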