Markov Models Applications in Natural Language Processing: A Survey



Talal Almutiri 1,* Farrukh Nadeem 1

1. Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia

* Corresponding author.


Received: 9 Aug. 2021 / Revised: 9 Oct. 2021 / Accepted: 16 Nov. 2021 / Published: 8 Apr. 2022

Index Terms

Hidden Markov Models, Markov Chains, Named Entity Recognition, Natural Language Generation, Natural Language Processing, Parts of Speech Tagging, Quantitative Analysis


Markov models are among the most widely used machine learning techniques for processing natural language. Markov chains and hidden Markov models are stochastic techniques for modeling dynamic systems in which the future state depends on the current state. The Markov chain, which generates a sequence of words to form a complete sentence, is frequently used in natural language generation. The hidden Markov model is employed in named-entity recognition and part-of-speech tagging, where it predicts hidden tags from observed words. This paper reviews the use of Markov models in three applications of natural language processing (NLP): natural language generation, named-entity recognition, and part-of-speech tagging. Researchers now try to reduce dependence on lexicons and annotation tasks in NLP, and this paper focuses on Markov models as a stochastic approach to NLP. A literature review was conducted to summarize research efforts, focusing on the methods and techniques that use Markov models for NLP and on their advantages and disadvantages. Most of the NLP studies reviewed apply supervised models and use Markov models to decrease the dependency on annotation tasks. Others employ unsupervised solutions to reduce dependence on a lexicon or labeled datasets.
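The two ideas summarized above can be illustrated with a minimal sketch. It is not taken from any of the surveyed systems: the function names (`build_chain`, `generate`, `viterbi`), the toy corpus, the two-tag set, and all probabilities are assumptions chosen only for illustration. The first part builds a first-order Markov chain over words and walks it to generate text; the second part uses the Viterbi algorithm to recover the most probable hidden tag sequence of a small hidden Markov model.

```python
import random
from collections import defaultdict


def build_chain(tokens):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for current, nxt in zip(tokens, tokens[1:]):
        chain[current].append(nxt)
    return chain


def generate(chain, start, max_words, rng):
    """Walk the chain: the next word depends only on the current word."""
    words = [start]
    # Stop at the length limit or when the current word has no successor.
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return words


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(obs, 0.0), p)
                for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Toy data (hypothetical): a six-word corpus and a two-tag HMM.
corpus = "the cat sat on the mat".split()
chain = build_chain(corpus)
sentence = generate(chain, "the", 5, random.Random(0))

states = ["NOUN", "VERB"]
start_p = {"NOUN": 0.7, "VERB": 0.3}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
           "VERB": {"NOUN": 0.6, "VERB": 0.4}}
emit_p = {"NOUN": {"dogs": 0.6, "bark": 0.1},
          "VERB": {"dogs": 0.05, "bark": 0.7}}
tags = viterbi(["dogs", "bark"], states, start_p, trans_p, emit_p)
print(" ".join(sentence))
print(tags)  # ['NOUN', 'VERB']
```

In practice the transition and emission probabilities would be estimated from a corpus (annotated in the supervised case) rather than written by hand, and log probabilities would normally replace the raw products to avoid underflow on long sentences.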

Cite This Paper

Talal Almutiri, Farrukh Nadeem, "Markov Models Applications in Natural Language Processing: A Survey", International Journal of Information Technology and Computer Science (IJITCS), Vol.14, No.2, pp.1-16, 2022. DOI: 10.5815/ijitcs.2022.02.01

