Spark-powered bioactivity prediction: a comparison of machine learning approaches
Abstract
Drug discovery is slow and expensive, which has long made it a bottleneck in scientific progress. Recent advances in computational methods, notably machine learning (ML) and artificial intelligence (AI), are transforming the field. Automated machine learning (AutoML) is a significant advancement that streamlines model selection and hyperparameter tuning. This study examines the potential of AutoML to accelerate drug discovery by comparing it with classical ML techniques. The focus is on predicting bioactivity for the epidermal growth factor receptor (EGFR), a critical protein implicated in many cancers. Apache Spark provides the scalability needed to process large, diverse datasets of biological, chemical, and genomic data tied to EGFR. The analysis evaluates the relative performance of the two approaches, with the aim of contributing actionable insights to drug discovery research.
Keywords
Big data; Epidermal growth factor receptor cancer; Machine learning; Spark machine learning library; Target bioactivity
DOI: http://doi.org/10.11591/ijai.v15.i3.pp2423-2430
Copyright (c) 2026 Nazif Tchagafo, Abderrahmane Ez-Zahout, Belaid Ahiod

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).