Performance of multivariate mutual information and autocorrelation encoding methods for the prediction of protein-protein interactions

Alhadi Bustamam, Mohamad Irlin Sunggawa, Titin Siswantining


Protein interactions play an essential role in the study of how an organism becomes infected with a disease and of the disease's effects. One challenge for the computational prediction of protein-protein interactions is how to represent an amino acid sequence as a vector, so that machine learning can be used to build a model that predicts whether or not the two proteins in a pair interact. This paper examines a qualitative feature encoding method for amino acid sequences, namely multivariate mutual information (MMI), and a quantitative feature encoding method, namely autocorrelation. We develop new designs for the MMI and autocorrelation feature encoding methods that give better results than previous research. We tested four ways to build the MMI method and six ways to build the autocorrelation method. We also built four types of mixed MMI-autocorrelation method and looked for the best form of each of the MMI, autocorrelation, and mixed methods. We combine these feature encoding methods with a support vector machine (SVM) as the machine learning method. We also test the proposed encoding methods with several other machine learning classifiers, such as random forest (RF), k-nearest neighbor (KNN), and gradient boosting.
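
To illustrate what a quantitative, autocorrelation-style encoding of a sequence can look like, the following is a minimal sketch and not the design proposed in the paper. It assumes the common auto covariance form, in which each residue is replaced by a physicochemical value and the lag-d feature is the mean product of centered values d positions apart. The Kyte-Doolittle hydropathy index, the function names `auto_covariance` and `encode_pair`, and the choice of `max_lag` are all assumptions made for this example.

```python
# Kyte-Doolittle hydropathy index, used here only as an example property.
# Assumption: the paper may use different or additional properties.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def auto_covariance(seq, scale, max_lag):
    """Return one auto covariance value per lag in 1..max_lag.

    For lag d and a sequence of length n, the value is the mean over
    i = 0..n-d-1 of (x[i] - mean) * (x[i + d] - mean), where x holds
    the property value of each residue.
    """
    values = [scale[aa] for aa in seq]
    n = len(values)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    features = []
    for lag in range(1, max_lag + 1):
        total = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        features.append(total / (n - lag))
    return features


def encode_pair(seq_a, seq_b, scale=KD_HYDROPATHY, max_lag=3):
    """Concatenate the per-protein features into one fixed-length vector."""
    return auto_covariance(seq_a, scale, max_lag) + auto_covariance(seq_b, scale, max_lag)


if __name__ == "__main__":
    # Two made-up peptide fragments, for illustration only.
    print(encode_pair("MKTAYIAKQR", "GAVLIMFWPS", max_lag=3))
```

The resulting vector has a fixed length of `2 * max_lag` regardless of sequence length, which is what allows protein pairs of different sizes to be passed to a classifier such as an SVM.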


Autocorrelation; Machine learning; Multivariate mutual information; Protein-protein interactions; Support vector machine


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
