Vision transformer and hybrid models for Malayalam handwritten word recognition

Anju Arangil Thazhath; Binu Poothakuzhiyil Chacko; Mohamed Basheer Kizhakke Parambath

doi:10.11591/ijai.v15.i3.pp2655-2663

Abstract

Transformer-based architectures and attention mechanisms have revolutionized the field of image recognition. This study focuses on offline handwritten Malayalam word recognition and addresses the lack of publicly available datasets for this low-resource language. A new Malayalam word dataset (MWD) comprising 20,850 samples across 139 classes was developed to support research in this domain. The vision transformer (ViT) was employed for feature extraction, and four recognition models were evaluated on top of it: a feed-forward neural network (FFNN), global average pooling (GAP), bidirectional long short-term memory (BiLSTM), and an attention-based feed-forward neural network (AFFNN). Among these, AFFNN achieved the highest accuracy, 98.56%, establishing the proposed vision transformer-based attention handwritten word recognition (ViTA-HWR) model as a robust framework for handwritten Malayalam word recognition and a valuable contribution to regional language processing.
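The abstract does not spell out how the AFFNN head is built, so the following is only a minimal sketch of one common reading: attention weights are computed over a sequence of token embeddings, the tokens are pooled with those weights, and a small feed-forward network maps the pooled vector to the 139 word classes. Everything except the class count is an assumption made for illustration: the function names, the embedding size, the hidden width, the scoring scheme, and the random vectors standing in for ViT patch embeddings.

```python
import math
import random

NUM_CLASSES = 139  # number of word classes in MWD, as stated in the abstract


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tokens, score_w):
    """Pool token vectors into one vector using softmax attention weights.

    Each token gets a scalar score (dot product with score_w); the pooled
    vector is the weighted sum of tokens. This scoring scheme is an assumption.
    """
    scores = [sum(t * w for t, w in zip(tok, score_w)) for tok in tokens]
    weights = softmax(scores)
    dim = len(tokens[0])
    pooled = [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
    return pooled, weights


def dense(x, W, b):
    """Fully connected layer: each row of W holds the weights of one output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + bj for row, bj in zip(W, b)]


def relu(x):
    return [max(0.0, v) for v in x]


def classify(tokens, params):
    """Attention pooling followed by a two-layer feed-forward classifier."""
    pooled, weights = attention_pool(tokens, params["score_w"])
    hidden = relu(dense(pooled, params["W1"], params["b1"]))
    probs = softmax(dense(hidden, params["W2"], params["b2"]))
    return probs, weights


def init_params(dim, hidden, rng):
    """Small random weights; a trained model would learn these instead."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "score_w": [rng.uniform(-0.1, 0.1) for _ in range(dim)],
        "W1": mat(hidden, dim), "b1": [0.0] * hidden,
        "W2": mat(NUM_CLASSES, hidden), "b2": [0.0] * NUM_CLASSES,
    }


rng = random.Random(0)
# Hypothetical stand-in for ViT output: 9 patch tokens of dimension 16.
tokens = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(9)]
params = init_params(dim=16, hidden=32, rng=rng)
probs, weights = classify(tokens, params)
predicted_class = max(range(NUM_CLASSES), key=lambda c: probs[c])
```

With untrained weights the prediction is meaningless; the sketch only shows how an attention step can sit between the token features and the classifier, in contrast to GAP, which would weight every token equally.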

Keywords

Attention mechanism; Feed-forward neural network; Handwritten word recognition; Malayalam word dataset; Vision transformer

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).
