An Optimized Machine Learning Approach for Predicting Parkinson's Disease


Mousumy Kundu 1,*, Md Asif Nashiry 1, Atish Kumar Dipongkor 1, Shauli Sarmin Sumi 1, Md. Alam Hossain 1

1. Department of Computer Science and Engineering, Jashore University of Science and Technology, Jashore, 7408, Bangladesh

* Corresponding author.


Received: 25 Jul. 2020 / Revised: 26 Aug. 2020 / Accepted: 25 Oct. 2020 / Published: 8 Aug. 2021

Index Terms

Parkinson's disease (PD), voice recording data, machine learning models, normalization, hyperparameter tuning


Parkinson's disease (PD) is an age-related neurodegenerative disorder that affects millions of elderly people worldwide. Early and accurate diagnosis of PD, combined with available treatment, might delay neurodegeneration and prevent disability. Existing diagnostic methods such as brain scans are expensive. Diagnosing PD from speech recordings with machine learning could be less expensive. Several studies have used voice recording data with machine learning algorithms to distinguish PD patients from healthy individuals. In this work, we use the voice recording dataset from the UCI Machine Learning Repository and propose an optimized data pre-processing approach that improves prediction accuracy for diagnosing PD. Using AdaBoost, we obtain 97.4% prediction accuracy together with higher sensitivity, specificity, precision, F1 score and kappa value. These improved evaluation metrics indicate that voice recordings combined with our optimized machine learning approach are highly reliable for predicting PD. This approach may have significant implications for cost-effective early-stage diagnosis of PD.
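
The evaluation metrics named in the abstract (accuracy, sensitivity, specificity, precision, F1 score and kappa) can all be derived from a binary confusion matrix. The following is a minimal Python sketch of those standard definitions, not the authors' implementation. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper. It assumes PD is the positive class and that no denominator is zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with PD assumed to be the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall: share of PD cases detected
    specificity = tn / (tn + fp)   # share of healthy cases correctly rejected
    precision = tp / (tp + fp)     # share of PD predictions that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_pos = ((tp + fp) / total) * ((tp + fn) / total)
    p_neg = ((tn + fn) / total) * ((tn + fp) / total)
    p_chance = p_pos + p_neg
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not results from the paper.
    for name, value in binary_metrics(tp=28, fp=1, fn=1, tn=9).items():
        print(f"{name}: {value:.3f}")
```

Reporting kappa alongside accuracy is useful when the classes are imbalanced, because kappa discounts the agreement a classifier would reach by chance.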

Cite This Paper

Mousumy Kundu, Md Asif Nashiry, Atish Kumar Dipongkor, Shauli Sarmin Sumi, Md. Alam Hossain, "An Optimized Machine Learning Approach for Predicting Parkinson's Disease", International Journal of Modern Education and Computer Science (IJMECS), Vol.13, No.4, pp. 68-74, 2021. DOI: 10.5815/ijmecs.2021.04.06

