CRNN model for text detection and classification from natural scenes

Puneeth Prakash, Sharath Kumar Yeliyur Hanumanthaiah, Somashekhar Bannur Mayigowda


Text recognition in natural scenes remains a significant challenge in computer vision because of variation in font, text size, and background complexity. This study introduces a method for the automatic detection and classification of cursive text in four languages (English, Hindi, Tamil, and Kannada) using a deep convolutional recurrent neural network (CRNN). The architecture combines a convolutional neural network (CNN) with a long short-term memory (LSTM) network so that spatial features and sequential dependencies are learned together. Pre-trained CNN models such as VGG-16 and ResNet-18 were used for feature extraction, and their performance was compared. The method outperformed existing techniques, achieving accuracies of 95.0%, 96.3%, and 96.2% on ICDAR 2015, ICDAR 2017, and a custom dataset (PDT2023), respectively. These results indicate that the approach is promising for practical scene-text applications.
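
As an illustration of the CNN-plus-LSTM arrangement described in the abstract, the following is a minimal PyTorch sketch and not the authors' implementation. The class name TinyCRNN, the layer sizes, the grayscale 32x128 input, the single unidirectional LSTM, and the value num_classes=37 are all assumptions made for the example; a small convolutional stack stands in for a pre-trained backbone such as VGG-16 or ResNet-18, and no training or decoding step is shown.

```python
import torch
import torch.nn as nn


class TinyCRNN(nn.Module):
    """Hypothetical CRNN: conv features -> column sequence -> LSTM -> per-step scores."""

    def __init__(self, num_classes: int, hidden_size: int = 64):
        super().__init__()
        # Small conv stack standing in for a pre-trained backbone (assumption).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),  # halves height and width
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),  # halves height and width again
        )
        # Collapse the height axis to 1 so each remaining column is one time step.
        self.pool_height = nn.AdaptiveAvgPool2d((1, None))
        self.rnn = nn.LSTM(input_size=64, hidden_size=hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        features = self.cnn(images)                       # (N, 64, H/4, W/4)
        features = self.pool_height(features)             # (N, 64, 1, W/4)
        sequence = features.squeeze(2).permute(0, 2, 1)   # (N, W/4, 64)
        outputs, _ = self.rnn(sequence)                   # (N, W/4, hidden_size)
        return self.classifier(outputs)                   # (N, W/4, num_classes)


model = TinyCRNN(num_classes=37)
batch = torch.randn(2, 1, 32, 128)  # two random grayscale crops, 32 high by 128 wide
logits = model(batch)
print(logits.shape)
```

The convolutional layers reduce a 128-pixel-wide crop to 32 feature columns, and the LSTM reads those columns left to right, producing one score vector per column. How the per-column scores are turned into a final detection or class label is left open here, since the abstract does not specify it.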


Scene text detection; Natural scene segmentation; Ensemble learning; PDT2023; ICDAR datasets

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
