Lt. Thomas Scaria; T. Christopher

Microarray Gene Retrieval System Based on LFDA and SVM

Full Text (PDF, 415KB), pp. 9-15


Author(s)

Lt. Thomas Scaria ¹,*, T. Christopher ²

1. Department of Computer Science, St. Pius X College, Kerala, India

2. Department of Information Technology, Government Arts College, Coimbatore, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2018.01.02

Received: 10 May 2017 / Revised: 11 Jun. 2017 / Accepted: 6 Jul. 2017 / Published: 8 Jan. 2018

Index Terms

DNA microarray technology, feature reduction, SVM classification, LFDA

Abstract

DNA microarray technology enables biologists to observe the expression of many thousands of genes in parallel. However, processing and extracting knowledge from voluminous microarray gene data is a serious challenge, and biologists need to retrieve the required data in a reasonable time. To address this issue, this work presents a gene retrieval system based on feature dimensionality reduction and classification of microarray gene data. Dimensionality reduction is performed by Local Fisher Discriminant Analysis (LFDA), which inherits the merits of both Fisher Discriminant Analysis (FDA) and Locality Preserving Projection (LPP). LFDA is chosen for its better performance on multimodal data. A Support Vector Machine (SVM) is employed as the classifier to distinguish between the genes. The SVM is trained on the dimensionality-reduced microarray gene data, which improves efficiency and reduces computational complexity. The performance of the proposed approach is compared with that of LPP and FDA. Additionally, the performance of SVM is compared with that of the k-Nearest Neighbour (k-NN) classifier. The combination of LFDA and SVM performs better than these alternatives in terms of accuracy, sensitivity and specificity.
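
The abstract reports results in terms of accuracy, sensitivity and specificity but does not say how they were computed. The following is a minimal sketch of the standard definitions, assuming a binary labelling in which 1 marks the positive class. The function names `confusion_counts` and `evaluate` and the sample labels are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task.

    Assumption: every label equal to `positive` is the positive class,
    and every other label is treated as negative.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity as a dict.

    accuracy    = (TP + TN) / (TP + FP + TN + FN)
    sensitivity = TP / (TP + FN)   (true positive rate)
    specificity = TN / (TN + FP)   (true negative rate)
    A ratio with a zero denominator is reported as 0.0.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical labels for ten samples (not data from the paper).
sample_truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
sample_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(evaluate(sample_truth, sample_pred))
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because accuracy alone can look high even if one class is mostly misclassified.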

Cite This Paper

Lt. Thomas Scaria, T. Christopher, "Microarray Gene Retrieval System Based on LFDA and SVM", International Journal of Intelligent Systems and Applications (IJISA), Vol. 10, No. 1, pp. 9-15, 2018. DOI: 10.5815/ijisa.2018.01.02


International Journal of Intelligent Systems and Applications (IJISA)