Reinforcement of low-resource language translation with neural machine translation and backtranslation synergies

Padma Prasada, Malode Vishwanatha Panduranga Rao

Abstract


This research investigates challenges and advancements in neural machine translation (NMT), specifically targeting English-to-Kannada translation. Emphasizing the scarcity of data and the linguistic complexity of low-resource languages (LRL), particularly Kannada, the study underscores the need for specialized techniques. Starting with an exploration of Kannada's historical and cultural significance, the paper highlights the critical importance of linguistic comprehension. The primary objective is to develop robust NMT models that produce precise and contextually relevant translations in low-resource scenarios. The novelty of this research lies in its approach to the challenges of Kannada NMT, which incorporates a comprehensive examination of historical and cultural context to establish a strong linguistic foundation. Motivated by the urgency of addressing translation needs in LRL, the paper proposes new strategies, notably advocating backtranslation to generate synthetic parallel corpora. Rigorous testing, including bilingual evaluation understudy (BLEU) score assessments, evaluates the effectiveness of the proposed approaches. Beyond assessing backtranslation, the study explores the challenges Kannada NMT faces in handling dialectal and spelling variations. The research reports a substantial 83-percentage-point average increase in BLEU scores, contingent on aligning unique Kannada terms with the same domain as existing occurrences. This study contributes to Kannada natural language processing by offering insights into the intricacies of NMT and providing practical solutions for enhancing translation accuracy in low-resource settings.
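To make the backtranslation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a reverse (Kannada-to-English) translator is available as a plain callable; the names `backtranslate`, `toy_reverse_translate`, and the two-entry lexicon are hypothetical stand-ins used only for illustration. Each genuine Kannada sentence is paired with a machine-generated English source, yielding synthetic English-to-Kannada training pairs.

```python
from typing import Callable, Iterable, List, Tuple


def backtranslate(
    target_sentences: Iterable[str],
    reverse_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Build (synthetic source, genuine target) pairs from monolingual target text."""
    pairs: List[Tuple[str, str]] = []
    for target in target_sentences:
        target = target.strip()
        if not target:
            continue  # skip blank lines in the monolingual corpus
        synthetic_source = reverse_translate(target).strip()
        if synthetic_source:  # drop sentences the reverse model could not translate
            pairs.append((synthetic_source, target))
    return pairs


# Hypothetical stand-in for a trained Kannada-to-English model (assumption).
_TOY_LEXICON = {"ನಮಸ್ಕಾರ": "hello", "ಧನ್ಯವಾದ": "thank you"}


def toy_reverse_translate(sentence: str) -> str:
    """Return a toy translation, or an empty string for unknown input."""
    return _TOY_LEXICON.get(sentence, "")


monolingual_kannada = ["ನಮಸ್ಕಾರ", "ಧನ್ಯವಾದ", "  ", "ಅಪರಿಚಿತ"]
synthetic_corpus = backtranslate(monolingual_kannada, toy_reverse_translate)
```

In practice the callable would wrap a trained reverse model, and the synthetic pairs would typically be mixed with the genuine parallel data before training the English-to-Kannada system.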


Keywords


Backtranslation; Low resource language; Neural machine translation; NLP data augmentation; Pretrained language models



DOI: http://doi.org/10.11591/ijai.v13.i3.pp3478-3488


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
