Classification of multiclass imbalanced data using cost-sensitive decision tree C5.0

M. Aldiki Febriantono, Rahmadwati Rahmadwati, Erni Yudaningtyas, Golshah Naghdy

Abstract


Multiclass imbalanced data is an active research problem in data mining. Class imbalance affects the classification process in machine learning. In some cases, the minority class in a dataset carries more important information than the majority class. When minority-class instances are misclassified, accuracy and overall classifier performance suffer. In this research, a cost-sensitive C5.0 decision tree was used to address multiclass imbalanced data problems. In the first stage, a decision tree model is built with the C5.0 algorithm; cost-sensitive learning with the MetaCost method is then applied to obtain the minimum-cost model. In testing, the C5.0 algorithm performed better than the C4.5 and ID3 algorithms. The performance percentages of C5.0, C4.5 and ID3 were 40.91%, 40.24% and 19.23%, respectively.
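The minimum-cost idea behind MetaCost can be illustrated with a small sketch. It is not the authors' implementation: it assumes class probabilities have already been estimated for each training example (MetaCost obtains them from a bagged ensemble of the base learner), and it assumes the convention that `cost[i][j]` is the cost of predicting class `i` when the true class is `j`. The names `min_expected_cost_class`, `relabel`, `COST` and `PROBS`, and all numeric values, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def min_expected_cost_class(
    probs: Sequence[float], cost: Sequence[Sequence[float]]
) -> int:
    """Return the class index i that minimises sum_j probs[j] * cost[i][j].

    probs[j]   : estimated probability that the example belongs to class j
    cost[i][j] : assumed cost of predicting class i when the true class is j
    Ties are resolved in favour of the lowest class index.
    """
    n = len(probs)
    expected = [sum(probs[j] * cost[i][j] for j in range(n)) for i in range(n)]
    return min(range(n), key=expected.__getitem__)


def relabel(
    prob_rows: Sequence[Sequence[float]], cost: Sequence[Sequence[float]]
) -> List[int]:
    """Relabel every example with its minimum-expected-cost class."""
    return [min_expected_cost_class(p, cost) for p in prob_rows]


# Hypothetical 3-class cost matrix: class 2 plays the minority class,
# so missing it (predicting 0 or 1 when the truth is 2) is made expensive.
COST = [
    [0.0, 1.0, 10.0],
    [1.0, 0.0, 10.0],
    [1.0, 1.0, 0.0],
]

# Hypothetical class-probability estimates for three training examples.
PROBS = [
    [0.70, 0.20, 0.10],
    [0.10, 0.85, 0.05],
    [0.05, 0.05, 0.90],
]

NEW_LABELS = relabel(PROBS, COST)
print(NEW_LABELS)
```

In this sketch the first example is relabelled to class 2 even though class 0 is the most probable, because the assumed cost of missing the minority class outweighs the difference in probability. In the MetaCost procedure, the base learner (here, a decision tree) would then be retrained on the relabelled data.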

Keywords


Classification; Decision tree C5.0; Metacost; Multiclass; Particle swarm optimization



DOI: http://doi.org/10.11591/ijai.v8.i4.pp%25p



Copyright (c) 2019 Institute of Advanced Engineering and Science

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.


Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.