A Dataset for Speech Recognition to Support Arabic Phoneme Pronunciation



Moner N. M. Arafa 1,*, Reda Elbarougy 2, A. A. Ewees 1, G. M. Behery 2

1. Faculty of Specific Education, Damietta University, Egypt

2. Faculty of Science, Damietta University, Egypt

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2018.04.04

Received: 17 Oct. 2017 / Revised: 10 Jan. 2018 / Accepted: 15 Feb. 2018 / Published: 8 Apr. 2018

Index Terms

Phoneme pronunciation, Arabic phoneme, Dataset, Arabic Dataset, Reading difficulties, Arabic phoneme pronunciation, feature extraction, MFCC


Some children find certain phonemes, such as vowels, difficult to pronounce. A teacher or parent can help them improve, but pronunciation errors are hard to detect without speaking with each student individually, and with today's large classes teachers cannot easily work with students one at a time. This study therefore proposes an automatic speech recognition system capable of detecting incorrect phoneme pronunciation. The system supports children by asking them to pronounce a phoneme and telling them whether the pronunciation is correct. In the future, the system could also provide the correct pronunciation and let children practise until they produce it correctly. To build the system, an experiment was conducted to collect a speech database: 89 elementary school children were each asked to produce 28 Arabic phonemes 10 times, giving 890 utterances per phoneme. For each utterance, the fundamental frequency (f0), the first four formants, and 13 MFCC coefficients were extracted for each frame of the speech signal. Seven statistics (maximum, minimum, range, mean, median, variance, and standard deviation) were then computed over each signal, so that each utterance is represented by 91 features. Next, human subjects judged whether each phoneme was pronounced correctly. In addition, six classifiers were applied to the extracted acoustic features to detect whether a phoneme was pronounced correctly. The experimental results show that the proposed method is effective at detecting the mispronounced phoneme ("أ").
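The statistical summarization step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-frame coefficients are already available as a list of frames, uses placeholder numbers instead of real MFCCs, and assumes population (rather than sample) variance and standard deviation, which the abstract does not specify. The function names `summarize_track` and `utterance_features` are hypothetical. One reading of the 91-feature count is 13 MFCCs × 7 statistics; the abstract does not spell out how f0 and the formants enter that count.

```python
import statistics
from typing import List, Sequence


def summarize_track(values: Sequence[float]) -> List[float]:
    """Return max, min, range, mean, median, variance, std for one track.

    A "track" is the sequence of values one coefficient takes across
    all frames of an utterance. Population variance/std are an assumption.
    """
    hi, lo = max(values), min(values)
    return [
        hi,
        lo,
        hi - lo,
        statistics.fmean(values),
        statistics.median(values),
        statistics.pvariance(values),
        statistics.pstdev(values),
    ]


def utterance_features(frames: Sequence[Sequence[float]]) -> List[float]:
    """Collapse a frames-by-coefficients table into one fixed-length vector.

    `frames` has one row per frame and one column per coefficient, so the
    result has (number of coefficients) x 7 entries regardless of duration.
    """
    tracks = zip(*frames)  # transpose: one tuple per coefficient
    features: List[float] = []
    for track in tracks:
        features.extend(summarize_track(track))
    return features


# Toy input: 5 frames x 13 placeholder coefficients (not real MFCCs).
toy_frames = [[float(f * 13 + c) for c in range(13)] for f in range(5)]
vector = utterance_features(toy_frames)
print(len(vector))  # 13 coefficients x 7 statistics
```

Summarizing each coefficient over time gives every utterance a vector of the same length regardless of how long the child took to say the phoneme, which is what allows standard classifiers to be applied directly.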

Cite This Paper

Moner N. M. Arafa, Reda Elbarougy, A. A. Ewees, G. M. Behery, "A Dataset for Speech Recognition to Support Arabic Phoneme Pronunciation", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 10, No. 4, pp. 31-38, 2018. DOI: 10.5815/ijigsp.2018.04.04

