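A novel method for handling gross errors in electromagnetic prospecting data
Cite this article: ZHANG Bi-Ming, JIANG Qi-Yun, MO Dan, XIAO Long-Ying. A novel method for handling gross errors in electromagnetic prospecting data[J]. Chinese Journal of Geophysics (地球物理学报), 2015, 58(6): 2087-2102.
Authors: ZHANG Bi-Ming  JIANG Qi-Yun  MO Dan  XIAO Long-Ying
Affiliations: 1. College of Geosciences and Info-Physics, Central South University, Changsha 410083; 2. College of Information Science and Engineering, Hunan International Economics University, Changsha 410205
Funding: Supported by the National Natural Science Foundation of China, Special Fund for the Development of Major Scientific Research Instruments and Equipment (41227803).
Summary: Because of various kinds of geoelectric interference, the raw electric field data acquired in electromagnetic prospecting often contain gross errors. The measurement of signal quantities in electromagnetic prospecting differs considerably from traditional precision measurement in the sources and characteristics of the errors and in the distribution of the measured values. In our tests, applying the traditional Pauta, Grubbs and Dixon criteria to the raw electric field data to identify and remove gross errors automatically gave poor results; Robust estimation and median filtering also failed to give satisfactory results; and picking out gross errors by hand is too inefficient. None of these approaches meets the requirements of preprocessing electromagnetic prospecting data. The authors propose an adaptive bidirectional mean-square-deviation threshold method for identifying and removing gross errors in electromagnetic prospecting data automatically. The method sorts the acquired raw electric field data samples and then, iteratively or recursively, computes at each step the mean square deviation of the two parts of the data before and after the midpoint. The larger of the two is compared with a preset mean-square-deviation threshold. If it exceeds the threshold, the gross error is judged to lie at the corresponding end, and the data point at the endpoint of that end is removed. The algorithm ends when both the front and rear mean square deviations are below the threshold or when fewer than 3 samples remain. The method features adaptive optimization, parameterized threshold control, suitability for small samples, and simple, efficient computation. Extensive experimental results show that with a mean-square-deviation threshold in the range of 30 to 90 (an empirical value), the method effectively removes gross errors from the raw electric field data of electromagnetic prospecting while retaining the most credible data. The method has already been applied to handle gross errors in several actual prospecting production projects, with satisfactory results.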

Keywords (Chinese record): Electromagnetic prospecting  Gross error handling  Mean-square-deviation threshold
Received: 2014-02-27

A novel method for handling gross errors in electromagnetic prospecting data
ZHANG Bi-Ming,JIANG Qi-Yun,MO Dan,XIAO Long-Ying.A novel method for handling gross errors in electromagnetic prospecting data[J].Chinese Journal of Geophysics,2015,58(6):2087-2102.
Authors:ZHANG Bi-Ming  JIANG Qi-Yun  MO Dan  XIAO Long-Ying
Institution:1. College of Geosciences and Info-Physics, Central South University, Changsha 410083, China; 2. College of Information Science and Engineering, Hunan International Economics University, Changsha 410205, China
Abstract:Because of various sources of interference, raw electric field data from electromagnetic (EM) prospecting usually contain gross errors. Signal measurement in EM prospecting differs considerably from traditional precision measurement in the sources and characteristics of its errors and in the distribution and range of the measured values. In our experiments, the traditional Pauta, Grubbs and Dixon criteria did not detect and eliminate gross errors in EM prospecting data satisfactorily, nor did the Robust Estimation and Median Filtering methods, and removing gross errors by hand is too inefficient. A method suited to this task is therefore needed. We propose a new adaptive bidirectional MSD (Mean Square Deviation) threshold method to detect and eliminate gross errors in raw EM prospecting data. The method first sorts the raw sample set and then, iteratively or recursively, splits the sorted set at its midpoint and calculates the MSD of the front part and of the rear part. The larger of the two MSDs is compared with a preset MSD threshold. If it exceeds the threshold, the single data point at the endpoint of the corresponding part is deleted; otherwise the algorithm terminates. The algorithm also terminates when fewer than 3 samples remain after a deletion. To test and verify the method, we used a large amount of real raw data from survey sites in two industrial projects, one a shale gas prospecting project and the other an oil-gas prospecting project. We tried various MSD threshold values, and values between 30 and 90 gave satisfactory results.
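The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `trim_gross_errors` is hypothetical, the MSD is taken to be the population standard deviation, and the two halves are assumed to share the middle element when the sample count is odd, since the abstract does not specify these details. The sample values are synthetic, and the threshold is expressed in the same units as the data.

```python
from statistics import pstdev


def trim_gross_errors(samples, threshold):
    """Sketch of an adaptive bidirectional MSD threshold trim.

    Assumptions (not stated in the abstract): MSD = population standard
    deviation; for an odd count the middle element belongs to both halves.
    """
    data = sorted(samples)               # step 1: sort the raw samples
    while len(data) >= 3:                # stop when fewer than 3 samples remain
        half = (len(data) + 1) // 2
        front_sd = pstdev(data[:half])               # spread of the lower part
        rear_sd = pstdev(data[len(data) - half:])    # spread of the upper part
        if max(front_sd, rear_sd) <= threshold:
            break                        # both parts are within the threshold
        if front_sd > rear_sd:
            data.pop(0)                  # drop the smallest value (front endpoint)
        else:
            data.pop()                   # drop the largest value (rear endpoint); ties go here
    return data


# Synthetic example: seven consistent readings plus three gross errors.
raw = [100, 102, 98, 101, 99, 103, 97, 900, 850, -400]
cleaned = trim_gross_errors(raw, threshold=30)
print(cleaned)
```

On this synthetic input the loop removes 900, then 850, then -400, and stops with the seven readings between 97 and 103. Only one endpoint is removed per pass, so the spread of both halves is re-evaluated after every deletion.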
Based on experimental data, we compare our method with the traditional Pauta, Grubbs and Dixon criteria and with the Robust Estimation and Median Filtering methods. In the first experimental case, poor data quality causes the tail (below 0.03125 Hz) of the full 40-frequency curve to exhibit a sharp peak. After gross errors are handled with our new method and with the Pauta, Grubbs and Dixon criteria, the results show that the new method smooths the sharp peak well and leaves the rest of the curve almost unchanged, whereas the Pauta and Dixon criteria have no effect on the sharp peak and in fact leave the whole curve unchanged. The Grubbs criterion also smooths the sharp peak, but it deletes most of the samples, including most of the credible ones, leaving only 2 samples at each frequency point after processing. This is a poor basis for subsequent data processing and interpretation. Looking in more detail, we compare the raw sample sets at several frequencies below 0.03125 Hz with the results of our method and find that the obvious gross error points are eliminated correctly and the most credible samples are preserved. In the same case, we tried four typical MSD threshold values: 90, 60, 30 and 15. The results show that lowering the threshold below 30 does not noticeably improve the result but removes more suspect sample points, which indicates that 30 is a good choice. In the second experimental case, we compare our method with the classical Robust Estimation method. The two methods are quite consistent above 0.03125 Hz: both smooth the sharp peak at 48 Hz and leave the rest of the curve almost unchanged. An obvious difference appears in the tail below 0.03125 Hz, where our method performs better than Robust Estimation.
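One known property of the Pauta (3-sigma) criterion illustrates how it can leave a small sample untouched. With 10 or fewer samples, no value can lie more than 3 sample standard deviations from the mean, because the largest possible deviation is (n-1)/sqrt(n) standard deviations (Samuelson's inequality). The sketch below shows this on synthetic data. The function name `pauta_keep` and the data are illustrative assumptions, and the abstract does not state that this is the reason for the result reported above.

```python
from statistics import mean, stdev


def pauta_keep(samples):
    """One pass of the Pauta (3-sigma) criterion using the sample standard deviation.

    Returns the samples whose distance from the mean is at most 3 standard deviations.
    """
    if len(samples) < 2:
        return list(samples)             # a standard deviation needs at least 2 points
    m = mean(samples)
    s = stdev(samples)
    return [x for x in samples if abs(x - m) <= 3 * s]


# Synthetic small sample (n = 10) with three obvious gross errors.
small = [100, 102, 98, 101, 99, 103, 97, 900, 850, -400]
print(len(pauta_keep(small)))            # all 10 samples survive

# Synthetic larger sample (n = 30) with one gross error, which is rejected.
large = [100] * 29 + [1000]
print(len(pauta_keep(large)))            # 29 samples survive
```

In the small sample the gross errors inflate the standard deviation to roughly 386, so the 3-sigma band is far wider than the largest deviation from the mean (695) and nothing is rejected.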
Furthermore, if the sample size is small (less than 20) and either the proportion of gross error samples is not small or their values are very large, as is especially the case at frequencies below 0.03125 Hz, the Robust Estimation method cannot obtain satisfactory results. Both Robust Estimation and our method work well at 48 Hz. In the third experimental case, we apply the SMF (Standard Median Filtering) method to the same two data samples used in the first and second cases and compare the results with those of our method. In the first example, SMF has no effect on the peak at 0.023438 Hz. The reason is that three large outliers sit at the rear endpoint of the sample, and SMF has an inherent limitation: it cannot handle outliers at an endpoint (either head or tail). In the second example SMF performs better than in the first, but not better than our method. At the end of the section on this case, we analyze the differences between Median Filtering and our method. Beyond these three experimental cases, many other experiments show that our method reliably eliminates gross errors mixed into EM data while keeping as much of the most credible data as possible. Choosing an appropriate MSD threshold between 30 and 90 gives satisfactory results. This work proposes a novel method based on statistical theory. Its advantages include adaptive optimization, threshold parameterization, suitability for small samples and for large proportions of gross error samples, ease of implementation, low computational cost and high efficiency. The MSD threshold range of 30 to 90 was determined empirically from a large amount of practical data handling and has proved appropriate for EM prospecting data. Theoretical analysis and experiments show that this method is more adaptable and gives better results than the classical Pauta, Grubbs and Dixon criteria.
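The endpoint limitation of median filtering mentioned above can be illustrated with a small sketch. The window length of 5 and the edge handling (repeating the first and last values) are assumptions chosen for illustration, since the abstract does not give the SMF settings used, and the data are synthetic.

```python
from statistics import median


def median_filter(samples, window=5):
    """Sliding median with an odd window; edges are padded by repeating the end values."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    if not samples:
        return []
    half = window // 2
    padded = [samples[0]] * half + list(samples) + [samples[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(samples))]


# A single spike in the interior is suppressed by the filter.
interior = [100, 101, 900, 99, 100, 102, 98]
print(median_filter(interior))

# Three large outliers at the rear endpoint survive filtering.
tail = [100, 101, 99, 100, 102, 98, 900, 950, 1000]
print(median_filter(tail))
```

With the outliers clustered at the tail, every window near the end contains a majority of outlier values (including the padded copies), so the median itself is an outlier and the last three outputs remain 900, 950 and 1000.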
Furthermore, under conditions of small sample size (less than 20), a large proportion of gross errors and very large gross error values, our method is more adaptable and obtains more satisfactory results than Robust Estimation. Under other conditions, the results of our method and Robust Estimation are quite consistent. Experiments and theoretical analysis also show that our method is superior to the Median Filtering method in its suitability for the application field, in the inherent limitations of the algorithm, and in the consistency of the algorithm's parameters.
Keywords:Electromagnetic prospecting  Gross error handling  MSD threshold
This article is indexed in CNKI and other databases.