Effects of kernels and the proportion of training data on the accuracy of SVM sentiment analysis in lecturer evaluation

Daniel Febrian Sengkey, Agustinus Jacobus, Fabian Johanes Manoppo

Abstract


Support Vector Machine (SVM) is a known method for supervised learning in sentiment analysis and there are many studies about the use of SVM in classifying the sentiments in lecturer evaluation. SVM has various parameters that can be tuned and kernels that can be chosen to improve the classifier accuracy. However, not all options have been explored.
Therefore, in this study we compared the four SVM kernels: radial, linear, polynomial, and sigmoid, to discover how each kernel influences the accuracy of the classifier.To make a proper assessment, we used our labeled dataset of students’ evaluations toward the lecturer. The dataset was split, one for training the classifier, and another one for testing the model. As an addition, we also used several different ratios of the training:testing dataset. The split ratios are 0.5 to 0.95, with the increment factor of 0.05. The dataset was split randomly, hence the splitting-training-testing processes were repeated 1,000 times for each kernel and splitting ratio. Therefore, at the end of the experiment, we got 40,000 accuracy data. Later, we applied statistical methods to see whether the differences are significant.Based on the statistical test, we found that in this particular case, the linear kernel significantly has higher accuracy compared to the other kernels. However, there is a tradeoff, where the results are getting more varied with a higher proportion of data used for training.

Keywords


Kernels comparison, Lecturer evaluation, Proportion of training data student opinion, Sentiment analysis, Support vector machine



DOI: http://doi.org/10.11591/ijai.v9.i4.pp%25p
Total views : 40 times

Refbacks

  • There are currently no refbacks.


View IJAI Stats

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.