We develop the classification part of a system that analyses transmitted light microscope images of dispersed kerogen preparations. The system automatically extracts kerogen pieces from the image and labels each piece as either inertinite or vitrinite. The image pre-processing consists of background removal, identification of kerogen material, object segmentation, object extraction (individual images of pieces of kerogen) and feature calculation for each object. An expert palynologist was asked to label the objects as inertinite or vitrinite, which provided the ground truth for the classification experiment. Ten state-of-the-art classifiers and classifier ensembles were compared: Naïve Bayes, decision tree, nearest neighbour, the logistic classifier, multilayer perceptron (MLP), support vector machines (SVM), AdaBoost, Bagging, LogitBoost and Random Forest. The logistic classifier was singled out as the most accurate classifier, with an accuracy greater than 90%. Using 10 times 10-fold cross-validation provided within the Weka software, we found that the logistic classifier was significantly better than five classifiers (p<0.05) and statistically indistinguishable from the other four classifiers. The initial set of 32 features was subsequently reduced to 6 features without compromising the classification accuracy. A further evaluation of the system alerted us to the possible sensitivity of the classification to the ground truth, which might vary from one human expert to another. The analysis also revealed that the logistic classifier made most of its correct classifications with high certainty.
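The 10 times 10-fold cross-validation protocol mentioned above can be illustrated with a minimal sketch. The study used the implementation in Weka; the pure-Python function below, `repeated_stratified_kfold`, is a hypothetical stand-in that only generates the train/test index splits, and the label counts are made up for illustration.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, k=10, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for `repeats` rounds of stratified k-fold CV."""
    rng = random.Random(seed)

    # Group sample indices by class so each fold keeps the class proportions.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        offset = 0
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)  # a fresh shuffle per repetition
            # Deal the indices round-robin across the folds.
            for pos, idx in enumerate(shuffled):
                folds[(pos + offset) % k].append(idx)
            offset += len(shuffled)
        for test in folds:
            test_set = set(test)
            train = [i for i in range(len(labels)) if i not in test_set]
            yield train, sorted(test)


# Hypothetical labels (illustrative only): 60 inertinite and 40 vitrinite objects.
labels = ["inertinite"] * 60 + ["vitrinite"] * 40
splits = list(repeated_stratified_kfold(labels))
```

Each of the resulting 100 splits would be used to train and score every classifier, and the per-split accuracies of two classifiers can then be compared with a paired significance test.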
Landslides are serious natural disasters, second only to earthquakes and floods, and pose a great threat to people's lives and property. Traditional landslide research relies on experience-driven or statistical models, and its assessment results are subjective, difficult to quantify, and poorly targeted. As a new research method for landslide susceptibility assessment, machine learning can greatly improve the accuracy of landslide susceptibility models by constructing statistical models. Taking western Henan as an example, the study considered 16 landslide influencing factors covering topography, geological environment, hydrological conditions, and human activities, of which the 11 factors with the most significant influence on landslides were selected by the recursive feature elimination (RFE) method. Five machine learning methods [Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Linear Discriminant Analysis (LDA)] were used to construct the spatial distribution model of landslide susceptibility. The models were evaluated by the receiver operating characteristic (ROC) curve and statistical indices. After analysis and comparison, the XGBoost model (AUC 0.8759) performed the best and was suitable for dealing with regression problems. The model adapted well to the landslide data. The landslide susceptibility maps of the five models show the overall distribution. The extremely high and high susceptibility areas are distributed in the Funiu Mountain range in the southwest, the Xiaoshan Mountain range in the west, and the Yellow River Basin in the north. These areas have large terrain fluctuations, complicated geological structural environments and frequent human engineering activities. The extremely high and high susceptibility areas covered 12043.3 km² and 3087.45 km², accounting for 47.61% and 12.20% of the total study area, respectively.
Our study reflects the distribution of landslide susceptibility in western Henan Province, which provides a scientific basis for regional disaster warning, prediction, and resource protection. The study has important practical significance for subsequent landslide disaster management.
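The AUC figure used to rank the models is the area under the ROC curve. This equals the probability that a randomly chosen landslide sample receives a higher susceptibility score than a randomly chosen non-landslide sample. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration, not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = landslide cell, 0 = non-landslide cell.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 pairs ranked correctly
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch. A rank-based formulation would be preferable for large susceptibility rasters.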