The Identification of Human Cassette Exons based on SVM

Full Text (PDF, 192KB), PP.57-64


Lan Tao 1 Yanmeng Xu 1,* Huakui Chen 1 Zexuan Zhu 1

1. Shenzhen University, Shenzhen, P.R. China

* Corresponding author.


Received: 2 Nov. 2010 / Revised: 1 Dec. 2010 / Accepted: 10 Jan. 2011 / Published: 8 Feb. 2011

Index Terms

Cassette exon, Constitutive exon, Support vector machine (SVM), Sequence features


Alternative splicing is the main mechanism for expanding transcript diversity. The cassette exon is an important form of alternative splicing, and its sequence features closely resemble those of constitutive exons. Previous studies based on expressed sequence tags (ESTs) and evolutionary conservation information have identified many alternative splicing events. In this paper, we construct a support vector machine (SVM) classifier that identifies human cassette exons using sequence information only. The classifier achieves an accuracy of 68.12% and reaches 76.54% when splicing frequency is taken into account. The results show that the sensitivity and specificity of this method are higher than those recently reported on the same dataset.
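As an illustration of the evaluation measures named in the abstract, the following minimal sketch computes accuracy, sensitivity, and specificity for a binary cassette-versus-constitutive classification. The function name `binary_metrics` and the label vectors are hypothetical and chosen only for illustration; they are not the paper's data or code.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1

    total = tp + tn + fp + fn
    return {
        # Fraction of all exons classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Fraction of true cassette exons that were recovered.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Fraction of true constitutive exons that were recovered.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical labels (assumption): 1 = cassette exon, 0 = constitutive exon.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters when the two exon classes are unbalanced, since accuracy alone can look good while one class is poorly recovered.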

Cite This Paper

Lan Tao, Yanmeng Xu, Huakui Chen, Zexuan Zhu, "The Identification of Human Cassette Exons based on SVM", IJEM, vol. 1, no. 1, pp. 57-64, 2011. DOI: 10.5815/ijem.2011.01.09

