Lip reading using deep learning in Turkish language

Hadi Pourmousa, Üstün Özen

Abstract


Computer vision is one of the most important areas of artificial intelligence, and lip reading is one of its most important subfields. Lip reading matters most in noisy environments or where no audio stream is available, and it is a research area that can help hearing-impaired people. Although datasets at the alphabet, word, and sentence level exist for other languages, there is no lip-reading dataset in Turkish. The dataset for this study was created by the author: video data were collected from 72 people for 71 words. The audio streams were removed from the collected videos, and the dataset was built from the images only. Because the dataset was small, the data were replicated using the Camtasia application. After the model was designed and trained, it was tested on the adjective, noun, and verb datasets, achieving success rates of 71.8%, 71.88%, and 79.69%, respectively.


Keywords


Convolutional neural networks; Dataset; Deep learning; Lip-reading; Turkish language



DOI: http://doi.org/10.11591/ijai.v13.i3.pp3250-3261



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN 2089-4872, e-ISSN 2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
