Total organic carbon (TOC) content is a crucial geochemical parameter for assessing the reservoir quality and hydrocarbon generation potential of source rocks. Accurate prediction of TOC content is important for optimizing the exploration and development of shale oil and gas. With the rapid development of artificial intelligence, individual machine learning algorithms have been increasingly applied to evaluate TOC content in shale. Despite their promising results, individual algorithms often suffer from overfitting, underfitting, and entrapment in local optima of the objective function. Ensemble learning models have been developed to address these limitations: they leverage the strengths of multiple individual algorithms to improve prediction accuracy and stability. The combination strategy is one of the key factors in optimizing an ensemble learning model. The arithmetic average, the simplest combination strategy, does not fully exploit the performance of the best individual model, and it can be severely degraded by an individual model with a large prediction error, which distorts the output of the overall model. In comparison, the weighted summation method, a common combination strategy, assigns weights to the individual models according to their performance on the training data. This method performs well on the training set but tends to perform poorly on the test set. This paper develops an ensemble model based on an intelligent matching technology (IMTEM). The proposed method uses a set of robust algorithms, including extreme gradient boosting, random forest, support vector machine, and extreme learning machine, as algorithm modules that initially process the input data. 
Then, the processed feature information, combined with the original log responses, is fed to a feedforward neural network layer for nonlinear transformation and feature learning, enabling accurate and continuous estimation of TOC content in shale. To validate its effectiveness, the IMTEM is applied to the prediction of TOC content in the Longmaxi Formation shale of the Sichuan Basin. Test results indicate that, compared with two ensemble models, five baseline models, and the ΔlogR method, the predictions of the IMTEM are more consistent with the measured TOC content. This demonstrates that the IMTEM is more suitable for predicting TOC content in shale.
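
The combination strategies contrasted above can be illustrated with a minimal sketch. It is not the IMTEM implementation: every number is invented for illustration, the inverse-RMSE weighting is only one assumed way of weighting models "according to their performance on training data", and the helper names (`arithmetic_average`, `weighted_summation`, `stacked_features`) are hypothetical. The last helper only shows how base-model outputs could be concatenated with the original log responses before being passed to a feedforward network; the network itself is omitted.

```python
from statistics import mean

# Hypothetical TOC predictions (wt%) from four base models for three samples.
# All values are invented for illustration; "elm" is deliberately a poor model.
base_preds = {
    "xgboost": [2.1, 3.0, 1.2],
    "random_forest": [2.0, 3.2, 1.1],
    "svm": [2.3, 2.9, 1.3],
    "elm": [4.0, 5.1, 3.0],
}

# Hypothetical training-set RMSE of each base model (assumed values).
train_rmse = {"xgboost": 0.2, "random_forest": 0.25, "svm": 0.4, "elm": 1.0}

# Hypothetical log responses per sample (e.g. gamma ray, density, sonic).
log_responses = [
    [150.0, 2.50, 75.0],
    [180.0, 2.45, 80.0],
    [120.0, 2.60, 70.0],
]


def arithmetic_average(preds):
    """Equal-weight combination: every model counts the same per sample."""
    return [mean(sample) for sample in zip(*preds.values())]


def weighted_summation(preds, errors):
    """Weight each model by its normalized inverse training error (an assumption)."""
    inverse = {name: 1.0 / errors[name] for name in preds}
    total = sum(inverse.values())
    weights = {name: value / total for name, value in inverse.items()}
    n_samples = len(next(iter(preds.values())))
    return [
        sum(weights[name] * preds[name][i] for name in preds)
        for i in range(n_samples)
    ]


def stacked_features(preds, logs):
    """Concatenate base-model outputs with the original log responses per sample."""
    return [
        list(sample) + list(log_row)
        for sample, log_row in zip(zip(*preds.values()), logs)
    ]


avg = arithmetic_average(base_preds)
wsum = weighted_summation(base_preds, train_rmse)
features = stacked_features(base_preds, log_responses)

print("average :", [round(v, 3) for v in avg])
print("weighted:", [round(v, 3) for v in wsum])
print("stacked feature vector for sample 0:", features[0])
```

In this toy case the poorly performing model pulls the arithmetic average upward on every sample, while the weighted summation limits its influence. The weights, however, are fixed from training error alone, which is the kind of dependence on training-set performance that a learned combination layer is intended to relax.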