Enhancing the performance of cancer text classification model based on cancer hallmarks

Noha Ali, Ahmed H. AbuEl-Atta, Hala H. Zayed


Deep learning (DL) algorithms achieved state-of-the-art performance in computer vision, speech recognition, and natural language processing (NLP). In this paper, we enhance the convolutional neural network (CNN) algorithm to classify cancer articles according to cancer hallmarks. The model implements a recent word embedding technique in the embedding layer. This technique uses the concept of distributed phrase representation and multi-word phrases embedding. The proposed model enhances the performance of the existing model used for biomedical text classification. The result of the proposed model overcomes the previous model by achieving an F-score equal to 83.87% using an unsupervised technique that trained on PubMed abstracts called PMC vectors (PMCVec) embedding. Also, we made another experiment on the same dataset using the recurrent neural network (RNN) algorithm with two different word embeddings Google news and PMCVec which achieving F-score equal to 74.9% and 76.26%, respectively.


Biomedical text classification cancer hallmarks; CNN; Deep learning; NLP; Phrase embedding; PMCVec; RNN

Full Text:


DOI: http://doi.org/10.11591/ijai.v10.i2.pp316-323


  • There are currently no refbacks.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).

View IJAI Stats