Enhanced framework for detecting Vietnamese hate and offensive spans
Abstract
The rise of hate and offensive content on social media platforms such as Facebook and Twitter has become an escalating concern, especially in Vietnam. Consequently, detecting hate and offensive spans in Vietnamese text is an essential area of research. This study introduces ViHateOff, a framework that combines a hated speech dictionary (HSD), automatically constructed from the Vietnamese hate and offensive spans (ViHOS) dataset, with PhoBERT-large, a pre-trained language model for Vietnamese, to enhance the detection of offensive expressions. The framework operates through two primary modules. First, it constructs an HSD from the ViHOS dataset, which serves as a reference for identifying hate and offensive language in Vietnamese text. Second, it integrates the PhoBERT-large model with the HSD to improve the detection of harmful words in the input text. Experimental results demonstrate that the proposed framework significantly outperforms existing state-of-the-art (SOTA) methods, achieving an F1-score of 0.8693 on the all-spans subset and 0.8709 on the multiple-spans subset, representing relative improvements of over 10% compared to the strongest baseline.
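The first module can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how the HSD is built or how it is fused with PhoBERT-large, so the function names `build_dictionary` and `match_spans`, the `(text, spans)` input layout with character offsets, and the plain substring matching are all hypothetical. The toy strings are neutral placeholders rather than ViHOS data, and the PhoBERT integration is not shown.

```python
from collections import Counter


def build_dictionary(samples, min_count=1):
    """Collect annotated span texts into a lowercase phrase set.

    samples: iterable of (text, [(start, end), ...]) pairs, where the
    offsets are character positions with an exclusive end (an assumed
    layout, not necessarily the ViHOS file format).
    """
    counts = Counter()
    for text, spans in samples:
        for start, end in spans:
            phrase = text[start:end].strip().lower()
            if phrase:
                counts[phrase] += 1
    # Keep phrases seen at least min_count times.
    return {phrase for phrase, count in counts.items() if count >= min_count}


def match_spans(text, dictionary):
    """Return sorted, merged (start, end) spans of dictionary hits.

    Uses plain case-insensitive substring search; a real system would
    likely need word segmentation and boundary checks.
    """
    lowered = text.lower()
    hits = []
    for phrase in dictionary:
        pos = lowered.find(phrase)
        while pos != -1:
            hits.append((pos, pos + len(phrase)))
            pos = lowered.find(phrase, pos + 1)
    hits.sort()
    merged = []
    for start, end in hits:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching hit: extend the previous span.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Placeholder training samples with annotated spans.
toy_samples = [
    ("you are a foo person", [(10, 13)]),
    ("such a bar baz thing", [(7, 14)]),
]
toy_dictionary = build_dictionary(toy_samples)
candidate_spans = match_spans("Foo and bar baz and foo", toy_dictionary)
```

In a setup like the one the abstract describes, spans such as `candidate_spans` could serve as an extra signal alongside a token-level model; how ViHateOff actually combines the two is left to the full paper.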
Keywords
Hate speech detection; Hated speech dictionary; Natural language processing; Offensive language; Social media; Vietnamese text
DOI: http://doi.org/10.11591/ijai.v15.i1.pp962-971
Copyright (c) 2026 Dinh-Hong Vu, Tuong Le

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).