IJISA Vol. 17, No. 3, 8 Jun. 2025
pp. 90-144
Diabetes Prediction, Machine Learning, XGBoost, K-NN Algorithm, Blood Glucose Monitoring, Intelligent System, Healthcare AI, Ensemble Methods, Risk Assessment, Pima Dataset
This paper presents the development and implementation of an intelligent system for predicting the risk of diabetes spread using machine learning techniques. The core of the system is an analysis of the Pima Indians Diabetes dataset with the k-nearest neighbours (k-NN), Random Forest, Logistic Regression, Decision Tree and XGBoost algorithms. After pre-processing the data, including normalization and handling of missing values, the k-NN model achieved an accuracy of 77.2%, precision of 80.0%, recall of 85.0%, F1-score of 83.0% and ROC AUC of 81.9%. The Random Forest model achieved an accuracy of 81.0%, precision of 87.0%, recall of 91.0%, F1-score of 89.0% and ROC AUC of 90.0%. The Logistic Regression model achieved an accuracy of 60.0%, precision of 93.0%, recall of 61.0%, F1-score of 74.0% and ROC AUC of 69.0%. The Decision Tree model achieved an accuracy of 79.0%, precision of 87.0%, recall of 89.0%, F1-score of 88.0% and ROC AUC of 83.0%. The XGBoost model outperformed the other models on every metric except precision, with an accuracy of 83.0%, precision of 85.0%, recall of 96.0%, F1-score of 90.0% and ROC AUC of 91.0%, indicating strong prediction capabilities. The proposed system integrates hardware (continuous glucose monitors) and software (AI-based classifiers) components to provide real-time tracking of blood glucose levels and early-stage prediction of diabetes risk. The novelty lies in the proposed architecture of a distributed intelligent monitoring system and in the use of ensemble learning for risk assessment. The results demonstrate the system's potential for proactive healthcare delivery and patient-centred diabetes management.
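The F1-scores quoted in the abstract can be cross-checked against the quoted precision and recall. The sketch below assumes that the reported F1-score is the standard harmonic mean of precision and recall, which the abstract does not state explicitly. The names `reported`, `f1_score` and `f1_gaps` are illustrative and do not come from the paper.

```python
# Precision, recall and F1 as quoted in the abstract, written as fractions.
# Assumption: F1 is the harmonic mean of precision and recall.
reported = {
    "k-NN": (0.80, 0.85, 0.83),
    "Random Forest": (0.87, 0.91, 0.89),
    "Logistic Regression": (0.93, 0.61, 0.74),
    "Decision Tree": (0.87, 0.89, 0.88),
    "XGBoost": (0.85, 0.96, 0.90),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_gaps(table):
    """Map each model to (recomputed F1 - reported F1)."""
    return {
        name: f1_score(p, r) - f1
        for name, (p, r, f1) in table.items()
    }


if __name__ == "__main__":
    for name, gap in f1_gaps(reported).items():
        p, r, f1 = reported[name]
        print(
            f"{name}: recomputed F1 = {f1_score(p, r):.3f}, "
            f"reported = {f1:.2f}, gap = {gap:+.3f}"
        )
```

Under this assumption, every recomputed value lies within one percentage point of the reported F1-score, which is what rounding of the inputs to whole percentages would allow.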
Dmytro Uhryn, Victoria Vysotska, Daryna Zadorozhna, Mariia Spodaryk, Kateryna Hazdiuk, Zhengbing Hu, "Intelligent Application for Predicting Diabetes Spread Risk in the World Based on Machine Learning", International Journal of Intelligent Systems and Applications (IJISA), Vol. 17, No. 3, pp. 90-144, 2025. DOI: 10.5815/ijisa.2025.03.06