Hate speech detection on Indonesian text using word embedding method-global vector

Mardhiya Hayaty; Arif Dwi Laksito; Sumarni Adi

doi:10.11591/ijai.v12.i4.pp1928-1937

Abstract

Hate speech is defined as communication directed toward a specific individual or group that involves hatred or anger; strongly worded language that sways people's opinions can cause social conflict. Because the number of Internet users globally, including in Indonesia, is continually rising, individuals have ever more opportunity to communicate their thoughts on online platforms. This study aims to observe the impact of pre-trained global vector (GloVe) word embedding on accuracy in the classification of hate speech and non-hate speech. Using pre-trained GloVe (Indonesian text) with single-layer and multi-layer long short-term memory (LSTM) classifiers yields performance that is more resistant to overfitting than the pre-trainable embedding for hate speech detection. The accuracy is 81.5% with a single-layer LSTM and 80.9% with a double-layer LSTM. Future work is to provide pre-trained embeddings built from a corpus of both formal and non-formal language, because pre-processing to handle non-formal words is very challenging.
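
As a rough illustration of how pre-trained GloVe vectors are commonly prepared for a classifier's embedding layer, the sketch below parses vectors in the standard GloVe text format and builds an embedding matrix for a small vocabulary. It is not the authors' code: the function names `load_glove` and `build_embedding_matrix`, the toy vector values, and the three-word vocabulary are all assumptions made for illustration.

```python
import io


def load_glove(lines):
    """Parse GloVe text format: one token per line, followed by its vector."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(vocab, vectors, dim):
    """Row i holds the vector for vocab[i - 1]; row 0 is reserved for padding.

    Words absent from the pre-trained vectors keep an all-zero row and are
    returned separately so that coverage can be inspected.
    """
    matrix = [[0.0] * dim for _ in range(len(vocab) + 1)]
    missing = []
    for i, word in enumerate(vocab, start=1):
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[i] = vec
        else:
            missing.append(word)
    return matrix, missing


# Toy stand-in for a pre-trained Indonesian GloVe file (values are made up).
toy_file = io.StringIO(
    "kamu 0.1 0.2 0.3\n"
    "bodoh -0.4 0.5 0.0\n"
    "tidak 0.2 -0.1 0.7\n"
)
vectors = load_glove(toy_file)

# "gak" is an informal form of "tidak" and is absent from the toy vectors.
vocab = ["kamu", "tidak", "gak"]
matrix, missing = build_embedding_matrix(vocab, vectors, dim=3)
coverage = 1 - len(missing) / len(vocab)
```

In a typical setup, a matrix like this would be loaded as the fixed weights of an embedding layer placed ahead of the LSTM. The `missing` list shows why non-formal words are a problem: informal spellings that the pre-trained vocabulary lacks receive no useful vector unless they are normalized during pre-processing.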

Keywords

Abusive; GloVe; Hate speech; Long short-term memory; Word embedding

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
