A three-step combination strategy for addressing outliers and class imbalance in software defect prediction

Muhammad Rizky Pribadi, Hindriyanto Dwi Purnomo, Hendry Hendry

Abstract


Software defect prediction often involves imbalanced datasets, in which one or more classes (the minority class) are underrepresented while the others (the majority class) are overrepresented. This imbalance can hinder accurate prediction of the minority class and lead to misclassification. Although the synthetic minority oversampling technique (SMOTE) is widely used to address imbalanced data, it can inadvertently generate synthetic minority samples that resemble the majority class and are therefore considered outliers. This study aims to enhance SMOTE by integrating it with an efficient algorithm designed to identify outliers among the synthetic minority samples. The resulting method, called reduced outliers (RO)-SMOTE, is evaluated on an imbalanced dataset, and its performance is compared with that of SMOTE. RO-SMOTE first oversamples the training data with SMOTE to balance the dataset. Next, it applies the mining outlier algorithm to detect and eliminate outliers. Finally, it applies SMOTE again to rebalance the dataset before passing it to the underlying classifier. The experimental results show that RO-SMOTE achieves higher accuracy, precision, recall, F1-score, and area under the curve (AUC) values than SMOTE.
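The three steps described in the abstract (oversample, remove outliers, oversample again) can be illustrated with a minimal sketch in plain Python. The function names (`smote`, `drop_outliers`, `ro_smote`) and the parameter `k` are hypothetical. The outlier test is a simple stand-in, assumed here for illustration only: a synthetic sample is dropped when most of its `k` nearest original training points belong to the majority class. It is not the mining outlier algorithm used in the paper, and the interpolation routine is a simplified version of SMOTE.

```python
import math
import random


def nearest(point, pool, k):
    """Return the k points in pool closest to point, skipping point itself."""
    others = [p for p in pool if p is not point]
    return sorted(others, key=lambda p: math.dist(point, p))[:k]


def smote(minority, n_new, k, rng):
    """Simplified SMOTE: interpolate between a sample and a near neighbor."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbor = rng.choice(nearest(base, minority, k))
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def drop_outliers(synthetic, minority, majority, k):
    """Stand-in outlier rule (an assumption, not the paper's algorithm):
    drop a synthetic point when most of its k nearest original points
    belong to the majority class."""
    labeled = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    kept = []
    for point in synthetic:
        closest = sorted(labeled, key=lambda item: math.dist(point, item[0]))[:k]
        majority_votes = sum(1 for _, label in closest if label == 0)
        if majority_votes <= k / 2:
            kept.append(point)
    return kept


def ro_smote(minority, majority, k=3, seed=0):
    """Return a minority set rebalanced to the size of the majority set.

    Assumes at least two minority samples and no more minority than
    majority samples."""
    rng = random.Random(seed)
    # Step 1: oversample the minority class until the classes are balanced.
    synthetic = smote(minority, len(majority) - len(minority), k, rng)
    # Step 2: discard synthetic samples flagged as outliers.
    cleaned = minority + drop_outliers(synthetic, minority, majority, k)
    # Step 3: oversample again to replace whatever step 2 removed.
    cleaned += smote(cleaned, len(majority) - len(cleaned), k, rng)
    return cleaned
```

Calling `ro_smote(minority, majority)` with two lists of equal-length numeric tuples returns the original minority samples followed by the retained and newly generated synthetic ones. In this sketch the samples added in step 3 are not filtered again, which mirrors the order of steps in the abstract.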


Keywords


Classification; Imbalanced data; Outliers; Software defect prediction; Synthetic minority oversampling technique

DOI: http://doi.org/10.11591/ijai.v13.i3.pp2987-2998

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).