Hybrid methods to identify ovarian cancer from imbalanced high-dimensional microarray data
Abstract
Scientists have used microarray data to distinguish healthy people from patients with various types of cancer, including ovarian cancer. Ovarian cancer is the most dangerous of the cancers that attack the female reproductive organs. Identifying ovarian cancer from microarray data requires the right combination of methods because such data is high-dimensional and imbalanced. This research proposes two hybrid methods that combine infinite feature selection (IFS) as the feature selector with classification and regression tree (CART) as the classifier. IFS can run in two separate scenarios: supervised infinite feature selection (SIFS) and unsupervised infinite feature selection (UIFS). This research also compares the performance of the two proposed hybrid methods (SIFS-CART and UIFS-CART) with CART without IFS. The dataset used is OVA_ovary, which has 10937 columns and 1545 rows. The results show that SIFS-CART achieves its maximum performance using 1000 features and UIFS-CART using 5000 features, whereas CART without IFS uses all 10935 features. In terms of balanced accuracy, SIFS-CART outperforms both CART without IFS and UIFS-CART. Because it reaches the highest balanced accuracy with fewer features, SIFS is more effective than UIFS at feature selection on the OVA_ovary dataset.
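The abstract reports balanced accuracy rather than plain accuracy. The minimal sketch below illustrates why that metric suits imbalanced data, assuming the usual definition of balanced accuracy as the mean of per-class recall. The function names and the toy labels are hypothetical illustrations and are not taken from the paper or from the OVA_ovary dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class weighs the same regardless of size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct predictions per class
    recalls = [hits[c] / support[c] for c in support]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not from the paper): 95 majority samples, 5 minority samples.
y_true = ["other"] * 95 + ["ovary"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["other"] * 100

plain = accuracy(y_true, y_pred)              # high, despite missing every minority sample
balanced = balanced_accuracy(y_true, y_pred)  # averages recall 1.0 and recall 0.0
print(f"accuracy={plain:.2f}, balanced accuracy={balanced:.2f}")
```

In this toy case, a classifier that never detects the minority class still scores well on plain accuracy, while balanced accuracy drops to the level of random guessing between two classes.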
DOI: http://doi.org/10.11591/ijai.v14.i2.pp1173-1182
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).