The Effect of Evolutionary Algorithm in Gene Subset Selection for Cancer Classification



M.N.F. Fajila 1,* M.A.C. Akmal Jahan 1

1. Department of Mathematical Sciences, Faculty of Applied Sciences, South Eastern University of Sri Lanka, Sri Lanka

* Corresponding author.


Received: 1 Apr. 2018 / Revised: 26 Apr. 2018 / Accepted: 20 May 2018 / Published: 8 Jul. 2018

Index Terms

Evolutionary Algorithm, Filters, Gene Subset, Microarray, Wrappers


Outcomes of cancer research to date show that there is still room for improvement, which motivates continued work in the field. This study proposes a hybrid machine learning method in which two filters are assessed together with a wrapper approach. Typically, filters rank the features, while wrappers identify feature subsets. Although filters and wrappers can each be used independently, they produce excellent results when applied in sequence. The filter-wrapper combination therefore plays a major role in feature selection, but it must be paired with a good strategy for searching the feature space. We therefore use an Evolutionary Algorithm to search the feature space for informative gene subsets. Several gene selection approaches for cancer classification exist, but many of them suffer from low classification accuracy and require large gene subsets for prediction; the Evolutionary Algorithm is proposed to overcome these problems. The proposed approach is evaluated on five microarray datasets, three of which are classified with 100% accuracy. Regardless of the number of genes selected, both filters give the same performance on all the datasets used. These results highlight the value of Evolutionary Algorithm search of the feature space for gene subset selection.
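The filter-then-wrapper pipeline described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration, since the abstract does not name the filters, the classifier, or the Evolutionary Algorithm settings: the filter is a signal-to-noise score, the wrapper fitness is the leave-one-out accuracy of a nearest-centroid classifier, the search is a basic genetic algorithm with one-point crossover and bit-flip mutation, and the data are synthetic (genes 3 and 17 are made informative on purpose).

```python
import random
import statistics

def filter_rank(X, y, top_k):
    """Filter step (assumed): rank genes by a signal-to-noise score, keep top_k."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-9
        scores.append((abs(statistics.mean(a) - statistics.mean(b)) / spread, g))
    return [g for _, g in sorted(scores, reverse=True)[:top_k]]

def loo_accuracy(X, y, genes):
    """Wrapper fitness (assumed): leave-one-out accuracy of nearest centroid."""
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dist = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroid = [statistics.mean(r[g] for r in rows) for g in genes]
            dist[label] = sum((X[i][g] - c) ** 2 for g, c in zip(genes, centroid))
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(X)

def evolve_subset(X, y, candidates, rng, pop_size=12, generations=15, p_mut=0.1):
    """Evolutionary search over boolean masks of the filtered candidate genes."""
    def fitness(mask):
        genes = [g for g, bit in zip(candidates, mask) if bit]
        # Small per-gene penalty so that smaller subsets win ties.
        return loo_accuracy(X, y, genes) - 0.001 * len(genes)

    pop = [[rng.random() < 0.5 for _ in candidates] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # elitism: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates))   # one-point crossover
            child = p1[:cut] + p2[cut:]
            children.append([(not bit) if rng.random() < p_mut else bit
                             for bit in child])       # bit-flip mutation
        pop = parents + children
    best = max(pop, key=fitness)
    return [g for g, bit in zip(candidates, best) if bit]

def make_toy_data(rng, n_samples=24, n_genes=40, informative=(3, 17)):
    """Synthetic two-class expression data; not one of the paper's datasets."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(0)
X, y = make_toy_data(rng)
candidates = filter_rank(X, y, top_k=8)        # filter narrows 40 genes to 8
subset = evolve_subset(X, y, candidates, rng)  # wrapper search picks a subset
accuracy = loo_accuracy(X, y, subset)
print(sorted(subset), accuracy)
```

The filter shrinks the search space first, so the evolutionary search only has to explore subsets of a short candidate list; the per-gene penalty in the fitness is one simple way to favour small gene subsets alongside accuracy.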

Cite This Paper

M.N.F. Fajila, M.A.C. Akmal Jahan, "The Effect of Evolutionary Algorithm in Gene Subset Selection for Cancer Classification", International Journal of Modern Education and Computer Science (IJMECS), Vol.10, No.7, pp. 60-66, 2018. DOI:10.5815/ijmecs.2018.07.06

