Machine learning for text document classification-efficient classification approach

Sura I. Mohammed Ali, Marwah Nihad, Hussien Mohamed Sharaf, Haitham Farouk

Abstract


Numerous alternative methods for text classification have been developed in response to the growing amount of online text information. The cosine similarity classifier is the most widely used simple and efficient approach. Text classification performance improves when cosine similarity is combined with the estimated values provided by conventional classifiers such as Multinomial Naive Bayesian (MNB): combining the similarity between a test document and a category with the estimated value for that category enhances the performance of the classifier. This approach provides a text document categorization method that is both efficient and effective. In addition, methods for determining the proper relationship between the set of words in a document and the document's category are also obtained.
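
The combination described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two scores are merged, so the weighted sum below, the `weight` and `alpha` parameters, the whitespace tokenizer, the class name `CosineMNB`, and the toy corpus are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CosineMNB:
    """Hypothetical classifier mixing an MNB posterior with cosine similarity."""

    def __init__(self, alpha=1.0, weight=0.5):
        self.alpha = alpha    # Laplace smoothing constant (assumed)
        self.weight = weight  # share given to the MNB estimate (assumed)

    def fit(self, docs, labels):
        self.term_counts = defaultdict(Counter)  # per-category term totals
        self.doc_counts = Counter(labels)        # per-category document totals
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(tokenize(doc))
        self.vocab = {t for c in self.term_counts.values() for t in c}
        return self

    def mnb_posteriors(self, tokens):
        # Log-space multinomial Naive Bayes, normalized to sum to 1.
        n_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, counts in self.term_counts.items():
            total = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[label] / n_docs)
            for t in tokens:
                if t in self.vocab:  # ignore terms never seen in training
                    score += math.log((counts[t] + self.alpha) / total)
            log_scores[label] = score
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}

    def scores(self, doc):
        tokens = tokenize(doc)
        tf = Counter(tokens)
        post = self.mnb_posteriors(tokens)
        # Assumed combination rule: convex mix of the MNB estimate and the
        # cosine similarity between the document and the category's terms.
        return {
            c: self.weight * post[c]
            + (1 - self.weight) * cosine(tf, self.term_counts[c])
            for c in post
        }

    def predict(self, doc):
        s = self.scores(doc)
        return max(s, key=s.get)


# Toy corpus, invented purely for illustration.
docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the new phone has a fast processor",
    "the laptop ships with more memory",
]
labels = ["sport", "sport", "tech", "tech"]

model = CosineMNB().fit(docs, labels)
print(model.predict("the goal won the match"))
print(model.predict("a fast laptop with more memory"))
```

Setting `weight=1.0` reduces the sketch to plain MNB and `weight=0.0` to a pure cosine similarity classifier, which makes it easy to compare the combined score against either component on the same data.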


Keywords


Cosine similarity; Information retrieval; Machine learning; Multinomial naïve Bayesian; Text document classifiers


DOI: http://doi.org/10.11591/ijai.v13.i1.pp703-710

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).