Acapella-based music generation with sequential models utilizing discrete cosine transform

Julian Saputra, Agi Prasetiadi, Iqsyahiro Kresna

Abstract


Creating instrumental accompaniment for the vocals in a song depends on the mood and creativity of the music composer. Models proposed by other researchers are restricted to musical instrument digital interface (MIDI) files and rely on recurrent neural networks (RNN) or Transformers to generate musical notes recursively. This research offers the world's first model capable of automatically generating instrumental accompaniment for human vocal sounds. Our model is built around three types of sound input: short input, combed input, and frequency-domain input based on the discrete cosine transform (DCT). By combining sequential models such as the autoencoder and the gated recurrent unit (GRU), we evaluate the performance of the resulting models in terms of loss and creativity. The best model achieved an average loss of 0.02993620155. In listening tests, the generated audio in the 0-1,600 Hz range is clearly audible and its tones are reasonably harmonious. The model has the potential to be developed further in future sound-processing research.
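As a rough illustration of the pipeline the abstract describes, the sketch below splits a vocal signal into frames, transforms each frame with the DCT, and feeds the frame sequence to a GRU-based sequence model that predicts accompaniment frames, which are then inverted back to audio. The sample rate, frame length, hop size, layer widths, and training call are illustrative assumptions, not the authors' actual configuration.

# Minimal sketch (not the authors' code): DCT framing of a vocal signal plus a
# GRU sequence model that maps vocal frames to accompaniment frames.
import numpy as np
from scipy.fft import dct, idct
import tensorflow as tf

SR = 16000      # assumed sample rate (Hz)
FRAME = 1024    # assumed frame length (samples)
HOP = 512       # assumed hop size (samples)

def to_dct_frames(signal):
    """Split a mono signal into overlapping frames and apply the DCT to each frame."""
    n = 1 + (len(signal) - FRAME) // HOP
    frames = np.stack([signal[i * HOP: i * HOP + FRAME] for i in range(n)])
    return dct(frames, type=2, norm="ortho", axis=-1)

def from_dct_frames(frames):
    """Invert the DCT of each frame and overlap-add the frames back into a waveform."""
    time_frames = idct(frames, type=2, norm="ortho", axis=-1)
    out = np.zeros(HOP * (len(time_frames) - 1) + FRAME)
    for i, f in enumerate(time_frames):
        out[i * HOP: i * HOP + FRAME] += f
    return out

# GRU sequence model: vocal DCT frames in, accompaniment DCT frames out.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, FRAME)),
    tf.keras.layers.GRU(256, return_sequences=True),
    tf.keras.layers.GRU(256, return_sequences=True),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(FRAME)),
])
model.compile(optimizer="adam", loss="mse")

# Usage sketch: `vocals` and `accompaniment` are aligned mono float arrays.
# x = to_dct_frames(vocals)[None, ...]           # shape (1, T, FRAME)
# y = to_dct_frames(accompaniment)[None, ...]
# model.fit(x, y, epochs=10)
# audio_out = from_dct_frames(model.predict(x)[0])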

Keywords


Autoencoder; Discrete cosine transform; Gated recurrent unit; Music instrument; Recurrent neural network



DOI: http://doi.org/10.11591/ijai.v13.i3.pp3371-3380



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
