Machine learning-based technique for big data sentiments extraction

Noraini Seman, Nurul Atiqah Razmi


A huge amount of data is generated every minute through social networking and content sharing on social media sites, in structured, unstructured, or semi-structured form. One of the most widely used social media sites is Twitter, where millions of unstructured tweets are generated every day. These tweets express people's opinions and can be used to extract their sentiments. Sentiment analysis helps organizations improve their products and make changes on demand to increase their profit. In this paper, three machine learning algorithms, Support Vector Machine (SVM), Decision Trees (DT), and Naive Bayes (NB), are applied to classify the sentiment of Twitter data. The purpose of this research is to compare the outcomes of these algorithms and identify the machine learning method that gives the most accurate and efficient results for classifying Twitter data. Our experimental results show that the same preprocessing methods applied to different datasets affect classifier performance similarly. The results show that SVM achieves 64.96%, 71.26%, and 91.25% precision, which is better than the other two algorithms. The overall recall and F-measure of SVM are also greater than those of NB and DT on the three datasets. However, further study of currently available preprocessing techniques is needed to improve the results of the various classifiers.
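The abstract compares classifiers by precision, recall, and F-measure. As a minimal sketch of how these three scores relate, the following computes them for a single class from true and predicted labels. The function name `precision_recall_f1` and the toy labels are hypothetical and chosen for illustration only; they are not the authors' code or data.

```python
from collections import Counter


def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, F-measure) for the class `positive`."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["tp"] += 1  # correctly predicted positive
        elif pred == positive:
            counts["fp"] += 1  # predicted positive, actually another class
        elif truth == positive:
            counts["fn"] += 1  # missed a true positive
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical toy labels, not taken from the paper's datasets.
y_true = ["pos", "pos", "neg", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "pos", "pos", "neg"]
p, r, f = precision_recall_f1(y_true, y_pred, "pos")
print(f"precision={p:.4f} recall={r:.4f} F-measure={f:.4f}")
```

In a comparison like the one described, the same scoring would be applied to the predictions of each classifier (SVM, DT, NB) on each dataset, so that the scores are directly comparable.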


Keywords: Classifier, F-measure, Machine learning, Recall, Sentiment analysis


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
