Performance comparison between naive bayes and k-nearest neighbor algorithm for the classification of Indonesian language articles

Titin Winarti, Henny Indriyawati, Vensy Vydia, Febrian Wahyu Christanto

Abstract


The match between an article's content and its theme is a main factor in whether the article is accepted. Many authors still find it difficult to determine which theme fits their article. For that reason, a document classification algorithm is needed that can group articles automatically and accurately. Many classification algorithms are available. This study uses the naive Bayes algorithm, with the k-nearest neighbor algorithm as the baseline. Naive Bayes was chosen because it can achieve high accuracy with little training data, while k-nearest neighbor was chosen because it is robust to noisy data. The performance of the two algorithms is compared to determine which is better at classifying documents. The results show that naive Bayes performs better, with an accuracy of 88%, while k-nearest neighbor has a fairly low accuracy of 60%.
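To make the comparison concrete, the following is a minimal sketch of the two classifiers applied to short Indonesian text snippets. It is not the authors' implementation: the abstract does not state the features, the smoothing, the distance measure, or the value of k, so whitespace tokenization, Laplace smoothing, cosine similarity over word counts, and k = 3 are all assumptions made here for illustration. The six training snippets and two test snippets are invented toy data and say nothing about the 88% and 60% figures reported above.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; no stemming or stop-word removal.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    # Collect per-class document counts, per-class word counts, and the vocabulary.
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_naive_bayes(model, doc):
    # Multinomial naive Bayes with Laplace (add-one) smoothing, in log space.
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(doc):
            if token in vocab:  # words never seen in training are skipped
                count = word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cosine(a, b):
    # Cosine similarity between two word-count vectors stored as Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_knn(docs, labels, doc, k=3):
    # Majority vote among the k training documents most similar to the query.
    query = Counter(tokenize(doc))
    scored = [(cosine(query, Counter(tokenize(d))), y) for d, y in zip(docs, labels)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(y for _, y in scored[:k])
    return votes.most_common(1)[0][0]


def accuracy(predict, docs, labels):
    # Fraction of documents whose predicted theme matches the true theme.
    return sum(predict(d) == y for d, y in zip(docs, labels)) / len(docs)


# Invented toy data (hypothetical themes "ekonomi" and "olahraga").
train_docs = [
    "harga saham naik di pasar modal",
    "bank menaikkan suku bunga kredit",
    "inflasi menekan harga pasar",
    "tim sepak bola menang di liga",
    "pemain mencetak gol di pertandingan",
    "pelatih memilih pemain untuk liga",
]
train_labels = ["ekonomi"] * 3 + ["olahraga"] * 3
test_docs = ["harga saham turun di pasar", "pemain bola mencetak gol"]
test_labels = ["ekonomi", "olahraga"]

model = train_naive_bayes(train_docs, train_labels)
nb_accuracy = accuracy(lambda d: predict_naive_bayes(model, d), test_docs, test_labels)
knn_accuracy = accuracy(lambda d: predict_knn(train_docs, train_labels, d), test_docs, test_labels)
print("naive Bayes accuracy:", nb_accuracy)
print("k-nearest neighbor accuracy:", knn_accuracy)
```

On data this small and clean the two classifiers are not expected to differ; a comparison like the one reported in the abstract would need a labeled article corpus and a held-out test split.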

Keywords


Articles classification; Indonesian language articles; K-nearest neighbor; Naive bayes


DOI: http://doi.org/10.11591/ijai.v10.i2.pp452-457

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.