Madhavi S. Pednekar; Kaustubh Bhattacharyya

Stacking Based Ensemble Learning with Deer Hunting Optimization for Automatic Identification of Malvani Dialects

PDF (2020KB), PP.109-132

Author(s)

Madhavi S. Pednekar ¹,*, Kaustubh Bhattacharyya ²

1. Department of EXTC, Don Bosco Institute of Technology, Mumbai - 400070, Maharashtra, India

2. Department of ECE, School of Technology, Assam Don Bosco University, Guwahati, Assam - 781017, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2025.06.07

Received: 11 Jan. 2025 / Revised: 11 May 2025 / Accepted: 16 Aug. 2025 / Published: 8 Dec. 2025

Index Terms

Dialect Identification, Audio Signal, Fractional Bandpass Filter, Gammatone Frequency Cepstral Coefficients, Shifted Delta Cepstral Coefficient, Deer Hunting Optimization

Abstract

Dialect Identification is a subtask of Language Identification (LID) that addresses the specific challenges posed by linguistic similarity between dialects. Several approaches to dialect identification exist, but automated prediction remains difficult because recorded speech is often of poor clarity and because features are selected inaccurately. It is essential to use an appropriate feature subset that retains enough signal information for the learning model to recognize dialects correctly. To address these issues, an optimized stacking-based ensemble learning approach is developed. The identification process begins with pre-processing using an adaptive least mean square filter and a fractional bandpass filter. Features are extracted from the pre-processed audio signal using Gammatone Frequency Cepstral Coefficients (GFCC) and Shifted Delta Cepstral Coefficients (SDCC). The extracted features are then reduced using Independent Component Analysis (ICA). The selected features are passed to a Recurrent Neural Network (RNN), which acts as the meta-classifier and additionally receives information from two distinct base classifiers, a Radial Basis Function Neural Network (RBFNN) and a Deep Belief Network (DBN). The hyperparameters of the RNN classifier are tuned using the Deer Hunting Optimization Algorithm (DHOA). The proposed approach achieves an accuracy of 97%, a precision of 96%, and an F1-score of 97%. These results suggest that the proposed approach is well suited to automatic dialect identification.
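To make the shifted delta cepstral step concrete, the following is a minimal sketch of the conventional shifted-delta computation, not the authors' implementation. The abstract does not state the parameters used, so the function name `shifted_delta_cepstra` and the values of `d`, `p`, and `k` are illustrative assumptions. The input is assumed to be a sequence of per-frame cepstral vectors, such as GFCC frames.

```python
from typing import List, Sequence


def shifted_delta_cepstra(
    frames: Sequence[Sequence[float]],
    d: int = 1,  # delta spread (assumed value, not from the paper)
    p: int = 3,  # shift between consecutive delta blocks (assumed)
    k: int = 7,  # number of stacked delta blocks (assumed)
) -> List[List[float]]:
    """Stack k delta blocks for each frame t.

    Block i at time t is frames[t + i*p + d] - frames[t + i*p - d].
    Only frames for which every required index is in range are emitted,
    so the output is shorter than the input.
    """
    if d < 1 or p < 1 or k < 1:
        raise ValueError("d, p and k must be positive integers")

    num_frames = len(frames)
    stacked_frames: List[List[float]] = []

    # First valid t is d; the last valid t satisfies
    # t + (k - 1) * p + d <= num_frames - 1.
    for t in range(d, num_frames - (k - 1) * p - d):
        stacked: List[float] = []
        for i in range(k):
            ahead = frames[t + i * p + d]
            behind = frames[t + i * p - d]
            stacked.extend(a - b for a, b in zip(ahead, behind))
        stacked_frames.append(stacked)
    return stacked_frames


if __name__ == "__main__":
    # Toy input: 10 frames of 2 coefficients each, growing linearly in time.
    toy = [[float(t), 2.0 * t] for t in range(10)]
    sdc = shifted_delta_cepstra(toy, d=1, p=2, k=3)
    print(len(sdc), len(sdc[0]))  # 4 frames, each of dimension 2 * 3 = 6
```

Each output vector has dimension N × k, where N is the number of cepstral coefficients per frame. In practice these vectors are often concatenated with the static coefficients before dimensionality reduction.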

Cite This Paper

Madhavi S. Pednekar, Kaustubh Bhattacharyya, "Stacking Based Ensemble Learning with Deer Hunting Optimization for Automatic Identification of Malvani Dialects", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol.17, No.6, pp. 109-132, 2025. DOI:10.5815/ijigsp.2025.06.07

International Journal of Image, Graphics and Signal Processing (IJIGSP)