Discriminative deep learning based hybrid spectro-temporal features for synthetic voice spoofing detection

Pranita Niraj Palsapure, Rajeswari Rajeswari, Sandeep Kumar Kempegowda, Kumbhar Trupti Ravikumar

Abstract


Voice-based systems such as speaker identification systems (SIS) and automatic speaker verification (ASV) systems are proliferating across industries such as finance and healthcare because they verify identity by analyzing unique speech patterns. Despite these advances, ASV systems remain susceptible to spoofing attacks, including logical access and replay attacks, which are hard to detect because the acoustic differences between authentic and spoofed voices are subtle. To counter these attacks, this study proposes a robust yet computationally efficient countermeasure system that couples a systematic data processing pipeline with a hybrid spectral-temporal learning approach. The aim is to identify features that improve both the model's detection accuracy and its computational efficiency. The model achieved an accuracy of 99.44% and an equal error rate (EER) of 0.014 in the logical access scenario of the ASVspoof 2019 challenge, indicating reliable detection of spoofing attacks with a low error rate.
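For readers unfamiliar with the equal error rate reported above, the following is a minimal sketch of how an EER can be estimated from detector scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that higher scores mean "more likely bona fide", and the toy scores are all assumptions made for illustration.

```python
import numpy as np

def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate the EER from two score lists (hypothetical helper).

    Assumes higher scores indicate bona fide speech. Returns the
    estimated EER and the threshold at which it was found.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct observed score.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))

    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER is the operating point where the two error rates meet;
    # on finite data, take the threshold where they are closest.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]

# Toy scores (invented for illustration only).
eer, threshold = equal_error_rate(
    [0.9, 0.8, 0.7, 0.4],  # bona fide trials
    [0.1, 0.2, 0.3, 0.6],  # spoofed trials
)
print(f"EER = {eer:.2f} at threshold {threshold:.2f}")
```

A lower EER means the false acceptance and false rejection rates cross at a lower error level, which is why it is commonly reported alongside accuracy for spoofing countermeasures.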

Keywords


Automatic speaker verification; Hybrid feature learning; LSTM-CNN; Spectral-temporal feature; Spoofing attack detection


DOI: http://doi.org/10.11591/ijai.v14.i1.pp130-141


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
