A Multi-channel Character Relationship Classification Model Based on Attention Mechanism

PP. 28-36



Yuhao Zhao 1,*, Hang Li 1, Shoulin Yin 1

1. Software College, Shenyang Normal University, Shenyang 110034, China

* Corresponding author.

DOI: https://doi.org/10.5815/ijmsc.2022.01.03

Received: 18 May 2021 / Revised: 16 Jun. 2021 / Accepted: 15 Jul. 2021 / Published: 8 Feb. 2022

Index Terms

Relation classification, attention mechanism, BERT, LSTM


Abstract

Relation classification is an important semantic processing task in natural language processing. Deep learning methods that combine Convolutional Neural Networks and Recurrent Neural Networks with an attention mechanism have long been the mainstream, state-of-the-art approach. The LSTM model, a recurrent neural network variant, dynamically controls its weights through gating, which lets it better extract contextual state information from sequences and effectively mitigates the long-term dependency problem of recurrent neural networks. The pre-trained model BERT has also achieved excellent results in many natural language processing tasks. This paper proposes a multi-channel character relationship classification model that combines BERT and LSTM through an attention mechanism. The attention mechanism fuses the semantic information produced by the two models to obtain the final classification result. Applied to a text, the model extracts and classifies the relationships between the characters that the text mentions. Experimental results show that the proposed method performs better than previous deep learning models on the SemEval-2010 Task 8 dataset and the COAE-2016-Task3 dataset.
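The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function or any dimensions, so everything here is an assumption made for illustration: the two channel outputs are stood in for by small hand-written vectors (`bert_vec`, `lstm_vec`), attention scores are plain dot products against a hypothetical learned `query` vector, and the classifier is a single linear layer with made-up weights. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_channels(channels, query):
    """Attention-weighted sum of per-channel sentence vectors.

    Each channel is scored by its dot product with `query` (an assumed
    scoring function), the scores are normalised with softmax, and the
    channels are summed with those weights.
    """
    weights = softmax([dot(c, query) for c in channels])
    dim = len(channels[0])
    fused = [sum(w * c[i] for w, c in zip(weights, channels)) for i in range(dim)]
    return fused, weights


def classify(fused, class_weights):
    """Linear layer plus softmax; returns (predicted label index, probabilities)."""
    probs = softmax([dot(fused, w) for w in class_weights])
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


# Toy stand-ins for the two channel outputs (hypothetical values).
bert_vec = [0.9, 0.1, 0.0]
lstm_vec = [0.2, 0.7, 0.1]
query = [1.0, 0.0, 0.0]                              # hypothetical attention query
class_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # two hypothetical relation classes

fused, weights = fuse_channels([bert_vec, lstm_vec], query)
label, probs = classify(fused, class_weights)
print("channel weights:", weights)
print("predicted class:", label)
```

In a real model the channel vectors would come from the BERT and LSTM encoders, and the query and classifier weights would be learned jointly during training.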

Cite This Paper

Yuhao Zhao, Hang Li, Shoulin Yin, "A Multi-channel Character Relationship Classification Model Based on Attention Mechanism", International Journal of Mathematical Sciences and Computing (IJMSC), Vol.8, No.1, pp. 28-36, 2022. DOI: 10.5815/ijmsc.2022.01.03

