Investigation on low-performance tuned-regressor of inhibitory concentration targeting the SARS-CoV-2 polyprotein 1ab
Abstract
Hyperparameter tuning is a key optimization strategy in machine learning (ML), and it is often carried out with an exhaustive grid search such as GridSearchCV to find the best hyperparameter combination. This study aimed to predict the half-maximal inhibitory concentration (IC50) of small molecules targeting the SARS-CoV-2 replicase polyprotein 1ab (pp1ab) by optimizing three ML algorithms: histogram gradient boosting regressor (HGBR), light gradient boosting regressor (LGBR), and random forest regressor (RFR). The bioactivity data contained duplicates and were processed in three ways: left untreated, aggregated by quantitative bioactivity, and deduplicated. Molecular features were encoded using twelve types of molecular fingerprints. The models were tuned with GridSearchCV across a broad parameter space. Despite this comprehensive tuning, model performance remained inconsistent. Further analysis showed that Murcko fragments were unevenly distributed between the training and testing datasets: key fragments were underrepresented in the testing dataset, leading to a mismatch in model predictions. The study demonstrates that hyperparameter tuning alone may not be sufficient to achieve high predictive performance when the distribution of molecular fragments is unbalanced between training and testing datasets. Ensuring fragment diversity across datasets is crucial for improving model reliability in drug discovery applications.
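The fragment imbalance described in the abstract can be illustrated with a small check. The following is a minimal sketch, not the authors' procedure: it assumes that a Murcko scaffold string has already been computed for each molecule (for example with a cheminformatics toolkit), and the function name `scaffold_coverage` and the example scaffold SMILES are hypothetical.

```python
from collections import Counter


def scaffold_coverage(train_scaffolds, test_scaffolds):
    """Compare scaffold occurrence between a training and a testing set.

    Both arguments are lists with one precomputed scaffold string per
    molecule (assumption: scaffolds were generated beforehand).
    """
    train_counts = Counter(train_scaffolds)
    test_counts = Counter(test_scaffolds)

    # Scaffolds that the model is tested on but never saw during training.
    unseen = sorted(s for s in test_counts if s not in train_counts)
    # Scaffolds that were learned but are never evaluated.
    train_only = sorted(s for s in train_counts if s not in test_counts)

    n_test = sum(test_counts.values())
    unseen_molecules = sum(test_counts[s] for s in unseen)

    return {
        "unseen_scaffolds": unseen,
        "train_only_scaffolds": train_only,
        # Share of test molecules whose scaffold is absent from training.
        "unseen_fraction": unseen_molecules / n_test if n_test else 0.0,
    }


# Hypothetical scaffold SMILES, for illustration only.
train = ["c1ccccc1", "c1ccccc1", "c1ccncc1", "C1CCNCC1"]
test = ["c1ccccc1", "c1ccc2ccccc2c1"]

report = scaffold_coverage(train, test)
print(report)
```

A high `unseen_fraction`, or a long `train_only_scaffolds` list, would suggest that the split itself limits what tuning can achieve, which is the situation the abstract reports.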
Keywords
Hyperparameter tuning; Inhibitory concentration 50; Murcko fragments; Quantitative structure-activity relationship; SARS-CoV-2 polyprotein 1ab
DOI: http://doi.org/10.11591/ijai.v14.i4.pp3003-3013
Copyright (c) 2025 Institute of Advanced Engineering and Science
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).