An enhanced cascade ensemble method for big data analysis

Ivan Izonin, Roman Muzyka, Roman Tkachenko, Michal Gregus, Roman Korzh, Kyrylo Yemets

Abstract


In the digital age, the proliferation of data presents both challenges and opportunities, particularly in big data, which is characterized by its volume, velocity, and variety. Machine learning is a crucial technology for extracting insights from these vast datasets. Among machine learning methods, ensemble methods, and cascade ensembles in particular, are highly effective for big data analysis. However, training a cascade ensemble can be time-consuming, and its accuracy may be limited. This paper proposes a method that addresses both issues by using stochastic gradient descent (SGD) classifiers, an improved algorithm for separating the training data, and principal component analysis (PCA) at each level of the ensemble. The proposed approach is designed to enhance both the generalization properties and the accuracy of the ensemble (3%), while also reducing its training time. Results from modelling on a real-world biomedical dataset demonstrate significant reductions in training duration, improvements in generalization properties, and enhanced accuracy when compared to other possible implementations of the ensemble.
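
As a rough illustration of the ingredients named in the abstract, the following Python sketch builds a small cascade in which every level applies PCA and then trains an SGD classifier on its own portion of the training data. It is not the authors' implementation. The random split into disjoint chunks, the use of each level's decision score as an extra input feature for the next level, the feature standardization step, the function names `fit_cascade` and `predict_cascade`, all parameter values, and the synthetic imbalanced dataset are assumptions made here for illustration. The paper's improved data separation algorithm is not reproduced.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def _stage_score(stage, Z):
    """Run one (scaler, PCA, SGD) stage and return its decision scores."""
    scaler, pca, clf = stage
    return clf.decision_function(pca.transform(scaler.transform(Z)))


def fit_cascade(X, y, n_levels=3, n_components=5, seed=0):
    """Hypothetical helper: train one stage per level on a disjoint chunk."""
    rng = np.random.default_rng(seed)
    # Assumption: a plain random split, one chunk of samples per level.
    chunks = np.array_split(rng.permutation(len(X)), n_levels)
    stages = []
    for idx in chunks:
        Z = X[idx]
        # Pass the chunk through all earlier levels, appending each
        # level's score as an additional feature column.
        for stage in stages:
            Z = np.column_stack([Z, _stage_score(stage, Z)])
        scaler = StandardScaler().fit(Z)
        pca = PCA(n_components=n_components).fit(scaler.transform(Z))
        clf = SGDClassifier(random_state=seed)
        clf.fit(pca.transform(scaler.transform(Z)), y[idx])
        stages.append((scaler, pca, clf))
    return stages


def predict_cascade(stages, X):
    """Hypothetical helper: propagate X through the levels, predict at the last."""
    Z = X
    for stage in stages[:-1]:
        Z = np.column_stack([Z, _stage_score(stage, Z)])
    scaler, pca, clf = stages[-1]
    return clf.predict(pca.transform(scaler.transform(Z)))


# Synthetic, imbalanced binary data used only to exercise the sketch.
X, y = make_classification(n_samples=1500, n_features=10, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

stages = fit_cascade(X_train, y_train)
pred = predict_cascade(stages, X_test)
```

In this sketch each level sees a smaller chunk of data than a single model trained on the full set would, and PCA reduces the input dimension before every SGD fit. Whether this reproduces the training-time and accuracy gains reported in the abstract depends on details of the method that are not given here.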

Keywords


Big data analysis; Binary classification task; Cascade ensemble; Imbalanced dataset; Kolmogorov-Gabor polynomial; Machine learning; Wiener polynomial


DOI: http://doi.org/10.11591/ijai.v14.i2.pp963-974


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
