Model optimisation of class imbalanced learning using ensemble classifier on over-sampling data

Yulia Ery Kurniawati, Yulius Denny Prabowo

Abstract


Data imbalance is a common problem in applications of machine learning and data mining. The imbalance often affects the most important class: in almost every case, the minority class is the one of greatest interest. Two approaches to this problem are the data-level approach and the algorithm-level approach. This study aims to find the best model for the pap smear dataset by combining a data-level approach with an algorithm-level approach to handle the class imbalance. Laboratory data are typically small and imbalanced. The data-level approach used in this study is over-sampling with the synthetic minority oversampling technique-nominal (SMOTE-N) and adaptive synthetic-nominal (ADASYN-N) algorithms. The algorithm-level approach is an ensemble classifier using AdaBoost and bagging, with the classification and regression tree (CART) as the base learner. In the experiments, the best model in terms of accuracy, precision, recall, and f-measure was obtained with ADASYN-N and AdaBoost-CART.
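As a rough illustration of the over-sampling step for nominal data, the following is a minimal sketch and not the authors' implementation. It assumes a binary problem, uses a plain overlap (Hamming) distance to find neighbours where SMOTE-N uses a value difference metric, and builds each synthetic sample by a per-feature majority vote. The function names (`overlap_distance`, `synthesize`, `balance`) and the toy records are hypothetical; the records are not taken from the pap smear dataset.

```python
import random
from collections import Counter


def overlap_distance(a, b):
    """Number of positions where two nominal feature vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def synthesize(minority, n_new, k=3, seed=0):
    """Create n_new synthetic nominal samples from minority-class rows.

    Assumption: neighbours are ranked by overlap distance, a simplification
    of the value difference metric used by SMOTE-N.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [m for j, m in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda m: overlap_distance(base, m))[:k]
        # Each feature takes the most frequent value among base and neighbours.
        sample = tuple(
            Counter(column).most_common(1)[0][0]
            for column in zip(base, *neighbours)
        )
        synthetic.append(sample)
    return synthetic


def balance(X, y, minority_label, k=3, seed=0):
    """Add synthetic minority rows until both classes have the same size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_rows = synthesize(minority, n_new, k, seed)
    return X + new_rows, y + [minority_label] * len(new_rows)


# Toy nominal records, made up for illustration only.
X = [
    ("small", "smooth", "light"),
    ("small", "smooth", "dark"),
    ("medium", "smooth", "light"),
    ("small", "rough", "light"),
    ("medium", "smooth", "dark"),
    ("small", "smooth", "light"),
    ("large", "rough", "dark"),
    ("large", "rough", "light"),
    ("medium", "rough", "dark"),
]
y = ["normal"] * 6 + ["abnormal"] * 3

X_bal, y_bal = balance(X, y, "abnormal")
print(Counter(y_bal))
```

The balanced `X_bal` and `y_bal` could then be passed to a boosted or bagged tree ensemble. ADASYN-style methods differ from this sketch mainly in that they generate more synthetic samples around minority points that are harder to classify, rather than choosing base points uniformly at random.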

Keywords


Adaptive synthetic-nominal; Class imbalance learning; Ensemble classifier; Over-sampling; Synthetic minority oversampling technique-nominal


DOI: http://doi.org/10.11591/ijai.v11.i1.pp276-283


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.