An enhanced cascade ensemble method for big data analysis
Abstract
In the digital age, the proliferation of data presents both challenges and opportunities, particularly in big data, which is characterized by its volume, velocity, and variety. Machine learning is a crucial technology for extracting insights from such datasets. Among machine learning methods, ensembles, and cascade ensembles in particular, are highly effective for big data analysis. However, training a cascade ensemble can be time-consuming, and its accuracy may be limited. This paper proposes a method that addresses both issues. It uses stochastic gradient descent (SGD) classifiers, an improved training data separation algorithm, and principal component analysis (PCA) at each level of the ensemble. The approach is designed to improve the generalization properties and accuracy of the ensemble (3%) while reducing its training time. Modelling on a real-world biomedical dataset demonstrates significant reductions in training duration, better generalization properties, and higher accuracy compared with other possible implementations of the ensemble.
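The general shape of such a cascade can be sketched in a few lines of scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_cascade`, `augment`, `predict_cascade`), the synthetic dataset, and the values `n_levels=3` and `n_components=5` are all hypothetical. A plain random split stands in for the paper's improved training data separation algorithm, which the abstract does not describe, and no polynomial feature expansion is applied.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def augment(X, levels):
    """Append each earlier level's decision score as an extra feature column."""
    for model in levels:
        score = model.decision_function(X).reshape(-1, 1)
        X = np.hstack([X, score])
    return X


def fit_cascade(X, y, n_levels=3, n_components=5, seed=0):
    """Train one scaler + PCA + SGD pipeline per level on disjoint data parts.

    Assumption: a random split replaces the paper's separation algorithm.
    """
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(X)), n_levels)
    levels = []
    for idx in parts:
        # Features for this level include the scores of all earlier levels.
        X_part = augment(X[idx], levels)
        model = make_pipeline(
            StandardScaler(),
            PCA(n_components=n_components),  # PCA applied at every level
            SGDClassifier(random_state=seed),
        )
        model.fit(X_part, y[idx])
        levels.append(model)
    return levels


def predict_cascade(levels, X):
    """Pass X through all levels; the last level gives the final label."""
    return levels[-1].predict(augment(X, levels[:-1]))


# Synthetic binary classification data (a stand-in for the biomedical dataset).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

levels = fit_cascade(X_train, y_train)
y_pred = predict_cascade(levels, X_test)
accuracy = float(np.mean(y_pred == y_test))
print(f"levels: {len(levels)}, test accuracy: {accuracy:.3f}")
```

Because each level sees only its own part of the training data, the cost of fitting any single model stays bounded as the dataset grows, and linear SGD classifiers keep each fit cheap. That is the intuition behind the training-time claim, though the figures reported in the paper come from the authors' own configuration.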
Keywords
Big data analysis; Binary classification task; Cascade ensemble; Imbalanced dataset; Kolmogorov-Gabor polynomial; Machine learning; Wiener polynomial
DOI: http://doi.org/10.11591/ijai.v14.i2.pp963-974
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).