Performance comparison between naive bayes and K-nearest neighbor algorithm for the classification of Indonesian Language articles

Titin Winarti, Henny Indriyawati, Vensy Vydia, Febrian Wahyu Christanto

Abstract


The match between the content of an article and the theme it is submitted under is the main factor in whether the article is accepted. Many authors still find it difficult to determine which theme fits their article. For that reason, a document classification algorithm is needed that can group articles automatically and accurately. Many classification algorithms are available. This study uses the Naive Bayes algorithm, with the K-Nearest Neighbor algorithm as the baseline. Naive Bayes was chosen because it can achieve high accuracy with little training data, while K-Nearest Neighbor was chosen because it is robust to noisy data. The performance of the two algorithms is compared to determine which is better at classifying documents. The results show that Naive Bayes performs better, with an accuracy of 88%, while K-Nearest Neighbor reaches a considerably lower accuracy of 60%.
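To make the comparison concrete, the following is a minimal sketch of the two classifiers applied to bag-of-words text, written in plain Python. It is not the authors' implementation: the abstract does not describe preprocessing, features, the value of k, or the distance measure, so whitespace tokenization, Laplace smoothing, cosine similarity, and k = 3 are all assumptions. The tiny Indonesian corpus and its two labels ("olahraga" for sport, "ekonomi" for economy) are invented for illustration and are not the study's data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization, no stemming or stopwords.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.word_counts = defaultdict(Counter)      # word frequencies per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)    # log prior
            for w in tokenize(doc):
                if w in self.vocab:                  # skip words unseen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class KNN:
    """K-Nearest Neighbor with majority voting over the k most similar documents."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, docs, labels):
        self.vectors = [Counter(tokenize(d)) for d in docs]
        self.labels = list(labels)
        return self

    def predict(self, doc):
        query = Counter(tokenize(doc))
        ranked = sorted(zip(self.vectors, self.labels),
                        key=lambda pair: cosine(query, pair[0]),
                        reverse=True)
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0]


def accuracy(model, docs, labels):
    # Fraction of documents whose predicted label matches the true label.
    correct = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return correct / len(docs)


# Invented toy corpus (not the study's data).
train_docs = [
    "tim sepak bola menang pertandingan liga",
    "pemain bola cetak gol pertandingan",
    "pelatih tim latihan pemain liga",
    "harga saham naik pasar modal",
    "bank turunkan bunga kredit pasar",
    "inflasi harga rupiah ekonomi bank",
]
train_labels = ["olahraga"] * 3 + ["ekonomi"] * 3
test_docs = ["pemain tim cetak gol", "harga saham bank naik"]
test_labels = ["olahraga", "ekonomi"]

nb = NaiveBayes().fit(train_docs, train_labels)
knn = KNN(k=3).fit(train_docs, train_labels)
print("Naive Bayes accuracy:", accuracy(nb, test_docs, test_labels))
print("KNN accuracy:", accuracy(knn, test_docs, test_labels))
```

On a corpus this small and cleanly separated both classifiers label the test documents correctly, so the sketch only shows how each method reaches a decision and how accuracy is computed; it says nothing about the 88% and 60% figures reported in the study.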

Keywords


Article classification; Indonesian language articles; K-nearest neighbor; Naive Bayes



DOI: http://doi.org/10.11591/ijai.v10.i2.pp%25p

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.