Automatic speech recognition for Indonesian medical dictation in cloud environment

Asril Jarin, Agung Santosa, Mohammad Teduh Uliniansyah, Lyla Ruslana Aini, Elvira Nurfadhilah, Gunarso Gunarso


This paper introduces SPWPM, an automatic speech recognition (ASR) system designed specifically for Indonesian medical dictation. The main objective of SPWPM is to assist medical professionals in producing medical reports and diagnosing patients. Deployed within a cloud computing service architecture, SPWPM targets a minimum speech recognition accuracy of 95%. The ASR model of SPWPM was developed using Kaldi and PyChain, and a comprehensive training dataset was created in collaboration with PT Dua-Empat-Tujuh and Harapan Kita Hospital. Several optimization techniques were applied, including language modeling with smoothing, lexicon generation using a grapheme-to-phoneme converter, and data augmentation. The readiness of this technology to assist hospital users was assessed through two evaluations: the SPWPM architecture test and the SPWPM speech recognition test. The results indicate that the system is ready to transcribe medical dictation accurately, showing its potential to improve medical reporting for healthcare professionals in hospital environments.
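The abstract does not state how the 95% accuracy target is measured. A common convention in ASR evaluation is word accuracy, defined as one minus the word error rate (WER), where WER is the word-level edit distance between the reference and the recognized transcript divided by the number of reference words. The sketch below illustrates that convention only; the function names and the sample Indonesian sentences are hypothetical and are not taken from the paper or from SPWPM.

```python
def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def word_accuracy(reference_text, hypothesis_text):
    """Return 1 - WER, assuming whitespace tokenization and case folding."""
    ref = reference_text.lower().split()
    hyp = hypothesis_text.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    wer = word_edit_distance(ref, hyp) / len(ref)
    return 1.0 - wer


if __name__ == "__main__":
    # Hypothetical dictation snippet, invented for illustration only.
    reference = "pasien mengalami nyeri dada sejak dua hari"
    hypothesis = "pasien mengalami nyeri dada sejak tiga hari"
    accuracy = word_accuracy(reference, hypothesis)
    print(f"word accuracy: {accuracy:.3f}")
    print("meets 95% target:", accuracy >= 0.95)
```

In this invented example, one of seven reference words is substituted, so the word accuracy is 6/7 (about 0.857) and falls below a 95% threshold. Note that 1 - WER can be negative when the hypothesis contains many insertions, which is one reason evaluations usually report WER directly.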


Medical Dictation; Speech Recognition; Kaldi ASR; PyChain; Indonesia Speech corpus


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
