Hindi spoken digit analysis for native and non-native speakers

Parabattina Bhagath; Malempati Shanmukha; Pradip K. Das

doi:10.11591/ijai.v14.i2.pp1561-1567

Abstract

Automated speech recognition (ASR) is the process of using an algorithm or automated system to recognize and transcribe the spoken words of a specific language. ASR has applications in fields such as mobile speech recognition, the internet of things, and human-machine interaction, and researchers have been working on ASR problems for more than 60 years. One of its many use cases is digit recognition, which supports applications that aid differently-abled individuals, children, and elderly people. However, spoken language data is scarce for under-developed and low-resourced languages, which makes building such systems difficult. Although this is not a pivotal issue for well-established languages like English, it has a significant impact on less-resourced languages. In this paper, we discuss the development of a Hindi spoken digit dataset and benchmark spoken digit models built with convolutional neural networks (CNNs). The dataset includes both native and non-native Hindi speakers. The CNN models achieve 88.44%, 95.15%, and 89.41% for non-native, native, and combined speakers, respectively.
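The per-group figures above could be tallied along the following lines. This is only a minimal sketch, assuming the reported percentages are classification accuracies (the abstract does not name the metric). The function name `accuracy_by_group`, the record layout, and the toy predictions are all hypothetical and are not taken from the paper. Here "combined" simply pools every utterance; the paper may instead train a separate model on the combined data.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return percentage accuracy per speaker group plus a pooled score.

    records: iterable of (group, true_digit, predicted_digit) tuples.
    The group labels are free-form strings (hypothetical layout).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in records:
        # Count each utterance once for its own group and once for the pool.
        for key in (group, "combined"):
            total[key] += 1
            correct[key] += int(truth == pred)
    return {key: 100.0 * correct[key] / total[key] for key in total}


# Toy, made-up predictions used only to exercise the function.
toy = [
    ("native", 3, 3),
    ("native", 7, 7),
    ("non-native", 3, 3),
    ("non-native", 7, 1),
]
scores = accuracy_by_group(toy)
```

Keeping the group label on each record lets one pass over the predictions produce all three scores, so the native, non-native, and pooled figures always come from the same set of utterances.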

Keywords

Convolutional neural networks; Digit recognition; Hindi speech; Mel frequency cepstral coefficients; Under-resourced speech recognition

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).
