From recurrent neural network techniques to pre-trained models: emphasis on the use in Arabic machine translation

Nouhaila Bensalah, Habib Ayad, Abdellah Adib, Abdelhamid Ibn El Farouk

Abstract


In recent years, neural machine translation (NMT) has attracted significant attention because it outperforms traditional statistical machine translation. However, NMT can be less effective when translating between languages with dissimilar structures, such as English and Arabic. Recent advances in natural language processing (NLP) have introduced unsupervised pre-training of large neural models, which has shown promise across a range of NLP tasks. This paper applies such pre-training to Arabic machine translation (MT). Specifically, we use pre-trained checkpoints from publicly available Arabic NLP models, such as Arabic bidirectional encoder representations from transformers (AraBERT) and Arabic generative pre-trained transformer (AraGPT), to initialize and warm-start the encoder and decoder of our transformer-based sequence-to-sequence model. This approach lets us incorporate Arabic-specific linguistic knowledge, such as word morphology and context, into the translation process. In a comprehensive empirical study, we evaluate our models against commonly used approaches in Arabic MT. The results show that our pre-trained models achieve new state-of-the-art performance in Arabic MT. These findings underscore the effectiveness of pre-trained checkpoints in improving Arabic MT, with potential real-world applications.
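
The warm-starting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy checkpoints, parameter names, sizes, and the helper functions random_vector and warm_start are all hypothetical, and plain Python lists stand in for weight tensors. The sketch copies every parameter that a pre-trained checkpoint provides and randomly initializes the rest. Here the remaining parameter is the decoder's cross-attention, which a decoder-only language model checkpoint would not contain.

```python
import random


def random_vector(size, rng):
    """Small random initialization for a parameter with `size` entries."""
    return [rng.uniform(-0.02, 0.02) for _ in range(size)]


def warm_start(model_shapes, checkpoint, prefix, rng):
    """Build parameters for one side (encoder or decoder) of a seq2seq model.

    model_shapes: {parameter name: size} expected by the seq2seq model.
    checkpoint:   {parameter name: list of floats} from a pre-trained model.
    Parameters whose name and size match are copied from the checkpoint;
    the others are randomly initialized and reported to the caller.
    """
    params, fresh = {}, []
    for name, size in model_shapes.items():
        saved = checkpoint.get(name)
        if saved is not None and len(saved) == size:
            params[prefix + name] = list(saved)  # copy pre-trained weights
        else:
            params[prefix + name] = random_vector(size, rng)
            fresh.append(prefix + name)
    return params, fresh


rng = random.Random(0)

# Hypothetical toy checkpoints standing in for a BERT-style encoder
# and a GPT-style decoder (assumed names and sizes, not real weights).
bert_like_ckpt = {"embeddings": [0.1] * 4, "layer0.self_attn": [0.2] * 4}
gpt_like_ckpt = {"embeddings": [0.3] * 4, "layer0.self_attn": [0.4] * 4}

encoder_shapes = {"embeddings": 4, "layer0.self_attn": 4}
# The seq2seq decoder also attends over encoder outputs (cross-attention),
# which the decoder-only toy checkpoint above does not provide.
decoder_shapes = {"embeddings": 4, "layer0.self_attn": 4, "layer0.cross_attn": 4}

enc_params, enc_fresh = warm_start(encoder_shapes, bert_like_ckpt, "encoder.", rng)
dec_params, dec_fresh = warm_start(decoder_shapes, gpt_like_ckpt, "decoder.", rng)

seq2seq_params = {**enc_params, **dec_params}
newly_initialized = enc_fresh + dec_fresh
print(newly_initialized)  # ['decoder.layer0.cross_attn']
```

In this sketch, only the parameters listed in newly_initialized start from random values. Everything else begins from pre-trained weights, and the whole model would then be fine-tuned on parallel translation data.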


Keywords


AraBERT; Arabic machine translation; AraGPT; Attention mechanism; Natural language processing; Pre-trained language models; Transformers


DOI: http://doi.org/10.11591/ijai.v13.i2.pp2403-2412

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
