A Novel Text Representation Model to Categorize Text Documents using Convolution Neural Network




M. B. Revanasiddappa 1,*, B. S. Harish 1

1. Department of Information Science and Engineering, Sri Jayachamarajendra College of Engineering, Mysuru 570006, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2019.05.05

Received: 24 Mar. 2018 / Revised: 29 Apr. 2018 / Accepted: 23 May 2018 / Published: 8 May 2019

Index Terms

Text Documents, Convolution Neural Network, Representation, Feature Selection, Categorization


This paper presents a novel text representation model called Convolution Term Model (CTM) for effective text categorization. In text categorization, representation plays a primary role. The proposed CTM is based on a Convolution Neural Network (CNN). The main advantage of the proposed text representation model is that it preserves semantic relationships and minimizes the feature extraction burden. In the proposed model, a convolution filter is first applied to the word embedding matrix. Since the resulting CTM matrix is high-dimensional, feature selection methods are applied to reduce the CTM feature space. The selected CTM features are then fed into a classifier to categorize the text documents. To evaluate the effectiveness of the proposed model, extensive experiments are carried out on four standard benchmark datasets, viz. 20-NewsGroups, Reuters-21578, Vehicle Wikipedia and 4 University, using five different classifiers. Accuracy is used to assess the performance of the classifiers. The proposed model shows impressive results with all classifiers.
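The pipeline described in the abstract can be sketched in a few lines of NumPy: slide a convolution filter over a document's word-embedding matrix to produce CTM features, then reduce the feature space before classification. All names, dimensions, and the variance-based selector below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy document: 50 words, each with a 20-dimensional embedding.
n_words, embed_dim = 50, 20
embedding_matrix = rng.standard_normal((n_words, embed_dim))

# A single convolution filter spanning 3 consecutive words (a 3-gram window).
filter_size = 3
conv_filter = rng.standard_normal((filter_size, embed_dim))

# Valid convolution along the word axis: one CTM feature per word window.
ctm = np.array([
    np.sum(embedding_matrix[i:i + filter_size] * conv_filter)
    for i in range(n_words - filter_size + 1)
])

# The CTM feature space grows with document length and filter count, so a
# feature selection step keeps only the most informative columns. Here we
# fake a corpus of 10 documents and keep the k highest-variance features.
corpus_ctm = rng.standard_normal((10, ctm.shape[0]))
k = 16
top_k = np.argsort(corpus_ctm.var(axis=0))[-k:]
selected = corpus_ctm[:, top_k]   # reduced features, ready for a classifier

print(ctm.shape, selected.shape)  # (48,) (10, 16)
```

In practice one would use many filters (yielding a CTM matrix rather than a vector) and a supervised selector such as chi-square or information gain, but the flow — convolve, flatten, select, classify — is the same.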

Cite This Paper

M. B. Revanasiddappa, B. S. Harish, "A Novel Text Representation Model to Categorize Text Documents using Convolution Neural Network", International Journal of Intelligent Systems and Applications (IJISA), Vol.11, No.5, pp.36-45, 2019. DOI:10.5815/ijisa.2019.05.05


References

[1]F. Sebastiani, “Machine learning in automated text categorization,” ACM computing surveys (CSUR), vol. 34, no. 1, pp. 1-47, 2002.
[2]F. S. Al-Anzi, and D. AbuZeina, “Beyond vector space model for hierarchical arabic text classification: A markov chain approach,” Information Processing & Management, vol. 54, no. 1, pp. 105-115, 2018.
[3]M. M. Mirończuk, and J. Protasiewicz, “A recent overview of the state-of-the-art elements of text classification,” Expert Systems with Applications, 2018.
[4]I. S. Abuhaiba, and H. M. Dawoud, “Combining Different Approaches to Improve Arabic Text Documents Classification,” International Journal of Intelligent Systems and Applications, vol. 9, no. 4, p.39, 2017.
[5]W. Wei, C. Guo, J. Chen, L. Tang, and L. Sun, “Ccodm: conditional cooccurrence degree matrix document representation method,” Soft Computing, pp. 1-17, 2017.
[6]Z. S. Harris, “Distributional structure,” Word, vol.10, no. 2-3, pp. 146-162, 1954.
[7]G. Salton, A. Wong, and C. S. Yang, “A vector space model for automatic indexing,” Communications of the ACM, vol. 18, no. 11, pp. 613-620, 1975.
[8]Y. H. Li, and A. K. Jain, “Classification of text documents,” The Computer Journal, vol. 41, no. 8, pp. 537-546, 1998.
[9]A. Hotho, A. Maedche, and S. Staab, “Ontology-based text document clustering,” KI, vol. 16, no. 4, pp. 48-54, 2002.
[10]W. Cavnar, “Using an n-gram-based document representation with a vector processing retrieval model,” NIST SPECIAL PUBLICATION SP, pp. 269-269, 1995.
[11]B. Choudhary, and P. Bhattacharyya, “Text clustering using universal networking language representation,” in: Proceedings of Eleventh International World Wide Web Conference, 2002.
[12]B. S. Harish, M. B. Revanasiddappa, and S. V. Arun Kumar, “Symbolic representation of text documents using multiple kernel fcm,” in: International Conference on Mining Intelligence and Knowledge Exploration, Springer, pp. 93-102, 2015.
[13]Q. Le, and T. Mikolov, “Distributed representations of sentences and documents,” in: International Conference on Machine Learning, pp. 1188-1196, 2014.
[14]M. Gupta, V. Varma, “Doc2sent2vec: A novel two-phase approach for learning document representation,” in: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, ACM, pp. 809-812, 2016.
[15]M. Keller, and S. Bengio, “A neural network for text representation,” in: International Conference on Artificial Neural Networks, Springer, pp. 667-672, 2005.
[16]Y. Li, B. Wei, Y. Liu, L. Yao, H. Chen, J. Yu, W. Zhu, “Incorporating knowledge into neural network for text representation,” Expert Systems with Applications, vol. 96, pp. 103-114, 2018.
[17]Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin, “A neural probabilistic language model,” Journal of machine learning research, vol. 3, pp.1137-1155, 2003.
[18]X. Zhang, J. Zhao, and Y. LeCun, “Character-level convolutional networks for text classification,” in: Advances in neural information processing systems, pp. 649-657, 2015.
[19]A. Conneau, H. Schwenk, L. Barrault, and Y. Lecun, “Very deep convolutional networks for text classification,” in: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol. 1, pp. 1107-1116, 2017.
[20]Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[21]J. Huang, D. Ji, S. Yao, and W. Huang, “Character-aware convolutional neural networks for paraphrase identification,” in: International Conference on Neural Information Processing, Springer, pp. 177-184, 2016.
[22]A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, “Learning word vectors for sentiment analysis,” in: Proceedings of the 49th annual meeting of the association for computational linguistics: Human language technologies-volume 1, Association for Computational Linguistics, pp. 142-150, 2011.
[23]R. Collobert, and J. Weston, “A unified architecture for natural language processing: Deep neural networks with multitask learning,” in: Proceedings of the 25th international conference on Machine learning, ACM, pp. 160-167, 2008.
[24]N. Neverova, C. Wolf, G. W. Taylor, and F. Nebout, “Multi-scale deep learning for gesture detection and localization,” in: Workshop at the European conference on computer vision, Springer, pp. 474-490, 2014.
[25]W. Huang, and J. Wang, “Character-level convolutional network for text classification applied to chinese corpus,” arXiv preprint arXiv:1611.04358, 2016.
[26]J. Gu, Z. Wang, J. Kuen, L. Ma, A. Shahroudy, B. Shuai, T. Liu, X. Wang, G. Wang, J. Cai, et al., “Recent advances in convolutional neural networks,” Pattern Recognition, 2017.
[27]A. K. Uysal, and S. Gunal, “The impact of preprocessing on text classification,” Information Processing & Management, vol. 50, no. 1, pp. 104-112, 2014.
[28]A. Rehman, K. Javed, and H. A. Babri, “Feature selection based on a normalized difference measure for text classification,” Information Processing & Management, vol. 53, no. 2, pp. 473-489, 2017.
[29]D. B. Patil, and Y. V. Dongre, “A Fuzzy Approach for Text Mining,” International Journal of Mathematical Sciences and Computing, vol. 4, pp. 34-43, 2015.
[30]B. S. Harish, and M. B. Revanasiddappa, “A New Feature Selection Method based on Intuitionistic Fuzzy Entropy to Categorize Text Documents,” International Journal of Interactive Multimedia and Artificial Intelligence, vol.5, no. 3, pp. 106-117, 2018.
[31]B. S. Harish, and M. B. Revanasiddappa, “A comprehensive survey on various feature selection methods to categorize text documents,” International Journal of Computer Applications, vol. 164, no. 8, 2017.
[32]B. S. Harish, D. S. Guru, and S. Manjunath, “Representation and classification of text documents: A brief review,” IJCA, Special Issue on RTIPPR, no. 2, pp. 110-119, 2010.
[33]Y. Kim, “Convolutional neural networks for sentence classification,” arXiv preprint arXiv:1408.5882, 2014.
[34]R. Johnson, and T. Zhang, “Effective use of word order for text categorization with convolutional neural networks,” arXiv preprint arXiv:1412.1058, 2014.
[35]R. Johnson, and T. Zhang, “Semi-supervised convolutional neural networks for text categorization via region embedding,” in: Advances in neural information processing systems, pp. 919-927, 2015.
[36]X. Zhang, J. Zhao, and Y. LeCun, “Character-level convolutional networks for text classification,” in: Advances in neural information processing systems, pp. 649-657, 2015.
[37]Y. Zhang, I. Marshall, and B. C. Wallace, “Rationale-augmented convolutional neural networks for text classification,” in: Proceedings of the Conference on Empirical Methods in Natural Language Processing, NIH Public Access, p. 795, 2016.
[38]L. Li, B. Qin, W. Ren, and T. Liu, “Document representation and feature combination for deceptive spam review detection,” Neurocomputing, vol. 254, pp. 33-41, 2017.
[39]J. Xu, B. Xu, P. Wang, S. Zheng, G. Tian, and J. Zhao, “Self-taught convolutional neural networks for short text clustering,” Neural Networks, vol. 88, pp. 22-31, 2017.
[40]C. Lee, and G. G. Lee, “Information gain and divergence-based feature selection for machine learning-based text categorization,” Information processing & management, vol. 42, no. 1, pp. 155-165, 2006.
[41]A. K. Uysal, and S. Gunal, “A novel probabilistic feature selection method for text classification,” Knowledge-Based Systems, vol. 36, pp. 226-235, 2012.
[42]D. M. Diab, and K. M. El Hindi, “Using differential evolution for fine tuning naive bayesian classifiers and its application for text classification,” Applied Soft Computing, vol. 54, pp. 183-199, 2017.
[43]Y. Ko, “How to use negative class information for naive bayes classification,” Information Processing & Management, vol. 53, no. 6, pp. 1255-1268, 2017.
[44]S. Jiang, G. Pang, M. Wu, and L. Kuang, “An improved k-nearest-neighbor algorithm for text categorization,” Expert Systems with Applications, vol. 39, no. 1, pp. 1503-1509, 2012.
[45]T. Joachims, “Text categorization with support vector machines: Learning with many relevant features,” in: European conference on machine learning, Springer, pp. 137-142, 1998.
[46]E. P. Jiang, “Semi-supervised text classification using rbf networks,” in: International Symposium on Intelligent Data Analysis, Springer, pp. 95-106, 2009.
[47]20newsgroups, http://people.csail.mit.edu/jrennie/20Newsgroups/.
[48]Reuters-21578, http://www.daviddlewis.com/resources/testcollections/reuters21578/.
[49]D. Isa, L. H. Lee, V. Kallimani, and R. Rajkumar, “Text document preprocessing with the bayes formula for classification using the support vector machine,” IEEE Transactions on Knowledge and Data engineering, vol. 20, no. 9, pp. 1264-1272, 2008.
[50]4-university, http://www.cs.cmu.edu/afs/cs/project/theo-20/www/data/