Modeling sentiment analysis of Indonesian biodiversity policy Tweets using IndoBERTweet
Abstract
This study develops and evaluates a sentiment analysis model based on IndoBERTweet for Twitter data on Indonesia’s biodiversity policy. Tweets on topics such as food security, health, and environmental management were collected, and a representative subset of 13,435 tweets, drawn from a larger dataset of 500,000, was annotated, with sentiment labels decided by majority voting to ensure reliability. IndoBERTweet was compared with seven traditional machine-learning classifiers that used TF-IDF and BERT embeddings for feature extraction. Model performance was assessed using mean accuracy, mean F1 score, and statistical significance (p-values). Word attribution techniques were also applied with BERT embeddings to make the predictions interpretable. IndoBERTweet models consistently outperformed the traditional methods in both accuracy and F1 score. Although BERT embeddings improved the performance of the conventional models, IndoBERTweet still achieved the best results, with p-values below 0.05 indicating that the differences were statistically significant. The word attribution analysis indicates that the model’s outputs are explainable and align with human understanding. These findings highlight IndoBERTweet’s potential to advance sentiment analysis practice.
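As an illustration of the majority-voting step mentioned above, the following is a minimal sketch rather than the authors' actual annotation pipeline. The abstract does not state the number of annotators per tweet, the label set, or how ties were handled, so the three-annotator setup, the `positive`/`neutral`/`negative` labels, the hypothetical helper `majority_label`, and the choice to return `None` when no strict majority exists are all assumptions.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators.

    Returns None when there are no votes or no label has more than half
    of them (an assumed tie-handling rule, not taken from the paper).
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Hypothetical annotations: three annotators per tweet (an assumption).
annotations = {
    "tweet_1": ["positive", "positive", "neutral"],
    "tweet_2": ["negative", "neutral", "positive"],
    "tweet_3": ["negative", "negative", "negative"],
}

# Aggregate one label per tweet; tweets without a majority map to None
# and could be sent back for re-annotation or dropped.
labels = {tweet_id: majority_label(v) for tweet_id, v in annotations.items()}
```

Under these assumptions, tweets without a strict majority are flagged rather than assigned an arbitrary label, which keeps ambiguous cases out of the training set.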
Keywords
BERT embeddings; Biodiversity policy; IndoBERTweet; Sentiment analysis; Twitter data
DOI: http://doi.org/10.11591/ijai.v14.i3.pp2389-2401
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).