IJIEEB Vol. 18, No. 2, 8 Apr. 2026
Keywords: Diabetes Prediction, Machine Learning Models, Support Vector Machine (SVM), Medical Data Classification, Classification Performance Metrics
Diabetes mellitus is a chronic metabolic disorder with a rapidly increasing global prevalence, posing a significant public health challenge. Early detection of diabetes can enable timely intervention and preventive measures, thereby reducing the risk of long-term complications. In this study, a machine learning (ML)-based methodology is proposed for the early prediction of diabetes mellitus. The proposed approach enhances existing prediction systems by improving key performance metrics, including precision, recall, and F1-score, and achieves an efficiency improvement of 4%–10% compared to state-of-the-art methods. Experimental results demonstrate that the support vector machine outperforms other ML algorithms for diabetes prediction, achieving 92% accuracy, 95% precision, 92% recall, 93% F1-score, 92% specificity, and an area under the receiver operating characteristic curve of 0.97.
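The threshold-based metrics named in the abstract (accuracy, precision, recall, F1-score, specificity) all follow from the four cells of a binary confusion matrix. The Python sketch below shows the standard definitions; the class name, function name, and the example counts are hypothetical and chosen only for illustration, and they are not the study's data or code. The area under the ROC curve is left out because it depends on the classifier's continuous scores rather than on a single confusion matrix.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class = diabetic)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard threshold-based metrics from confusion counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)  # also called sensitivity
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# Illustrative counts only (assumption): 20 hypothetical test cases.
example = ConfusionCounts(tp=8, fp=2, tn=6, fn=4)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because precision and recall can move in opposite directions as the decision threshold changes, reporting them together with specificity and F1-score, as the abstract does, gives a fuller picture than accuracy alone.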
Santanu Basak, Angshuman Khan, Mayank Raj, Abhishek Pandey, "Predicting Diabetes Using Machine Learning: Models and Insights", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol. 18, No. 2, pp. 192-204, 2026. DOI: 10.5815/ijieeb.2026.02.12