Hybrid model: IndoBERT and long short-term memory for detecting Indonesian hoax news

Danny Yongky Yefferson, Viriyaputra Lawijaya, Abba Suganda Girsang

Abstract


Technology has advanced rapidly, and as a result information now spreads easily. However, not all information spread through social media is factual. In response to this phenomenon, we built a hoax detection system that combines Indo bidirectional encoder representations from transformers (IndoBERT) with long short-term memory (LSTM). The dataset was obtained by scraping the sites turnbackhoax.id and Cable News Network (CNN) Indonesia, and contains 5,876 items: 1,998 factual news items and 3,878 hoaxes. We use IndoBERT as the feature extractor and LSTM as the classification layer because this combination is well suited to processing and understanding the Indonesian language. The IndoBERT-LSTM model achieved an accuracy of 93.2%, precision of 92%, recall of 89.7%, and F1-score of 90.8%. These results indicate that IndoBERT-LSTM is a promising approach for detecting hoaxes accurately and efficiently, with the potential to make a significant impact in the fight against the spread of hoaxes.
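As a quick consistency check on the figures above, the following minimal Python sketch recomputes the F1-score from the reported precision and recall and derives the class balance from the reported counts. It is an illustration only, not the authors' evaluation code: the helper name `f1_score` is hypothetical, the abstract does not state which class is treated as positive, and the majority-class baseline is an added point of comparison that the abstract does not report.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures as reported in the abstract.
precision, recall = 0.92, 0.897
n_factual, n_hoax = 1998, 3878
n_total = n_factual + n_hoax

f1 = f1_score(precision, recall)

# Assumption for illustration: a trivial classifier that always predicts
# the larger class ("hoax") would reach this accuracy on the full dataset.
hoax_share = n_hoax / n_total

print(f"F1 from reported precision and recall: {f1:.3f}")
print(f"Total items: {n_total}")
print(f"Hoax share (majority-class baseline accuracy): {hoax_share:.3f}")
```

The recomputed F1 rounds to the reported 90.8%, and the two class counts sum to the reported total of 5,876. Because roughly two thirds of the items are hoaxes, the reported 93.2% accuracy is best read against that imbalance rather than against a 50% baseline.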

Keywords


Article; Detection system; Hoax; IndoBERT; Long short-term memory; Machine learning; News



DOI: http://doi.org/10.11591/ijai.v13.i2.pp1913-1924



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
