Unveiling DNA sequences: a comparison of machine learning and deep learning techniques for prediction
Abstract
DNA is the biological macromolecule that carries the information for the amino acid sequences of all proteins. With the help of these protein sequences, we explore mutated genes and disease-causing mutated genomic patterns. Advances in genomic technology are currently producing DNA sequence data at an explosive rate, and external factors have increased the volume of research into DNA genomes. Initially, DNA sequence analysis was accomplished with the support of databases, data structures, and sequence similarity. This approach is capable of extracting a particular property of DNA. We employ a deep learning algorithm to extract features from protein sequences. The DNA sequence is classified based on the built-in protein structures extracted into a FASTA file. The approach is tested experimentally on E. coli DNA, using 106 samples of 57 nucleotides each. Finally, we compare the results with the existing decision tree, k-nearest neighbors (KNN) classification, random forest, and neural network algorithms. The deep learning algorithm yields a higher efficiency, 98%, than the other machine learning algorithms. This highlights the potential of deep learning in genomics research and its ability to yield superior results in classifying DNA sequences.
Keywords
Artificial neural network; Decision tree; Fasta; K-nearest neighbors-classification; Random forest
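As a rough illustration of the kind of sequence classification the abstract compares, the sketch below one-hot encodes short DNA strings and classifies a query with a plain k-nearest neighbors vote, one of the baseline methods named above. It is not the authors' pipeline: the one-hot encoding, the function names (`one_hot`, `knn_predict`), the class labels, and the eight-nucleotide toy sequences are all assumptions made up for this example, and none of it is data from the study.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a flat list of 0/1 values, four per nucleotide."""
    vec = []
    for base in seq.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected symbol in sequence: {base!r}")
        vec.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return vec


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training sequences closest to query.

    train is a list of (sequence, label) pairs; all sequences are assumed
    to have the same length as the query.
    """
    q = one_hot(query)
    nearest = sorted(
        train, key=lambda item: squared_distance(one_hot(item[0]), q)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data, invented for illustration only (not the E. coli set).
TOY_TRAIN = [
    ("TATAATGC", "positive"),
    ("TATAATCC", "positive"),
    ("TATTATGC", "positive"),
    ("GCGCGCGC", "negative"),
    ("GCGCCCGC", "negative"),
    ("GGGCGCGC", "negative"),
]

if __name__ == "__main__":
    print(knn_predict(TOY_TRAIN, "TATAATGG", k=3))  # prints "positive"
```

With one-hot vectors, the squared distance between two sequences is twice the number of mismatched positions, so this vote is equivalent to KNN under Hamming distance. A real comparison of the kind described in the abstract would use the full set of sequences, a held-out test split, and the other classifiers as well.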
DOI: http://doi.org/10.11591/ijai.v13.i4.pp4583-4593
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).