IJEME Vol. 15, No. 4, 8 Aug. 2025
Clustering Analysis, K-Means, k-NN, Learning Outcomes Mapping, Elbow Method
This study explores the integration of two methods, K-Means and k-NN: K-Means identifies categories in learning outcome data, while k-NN predicts which category a student's learning outcomes fall into. Using the Elbow method, the optimal number of clusters was determined to be three. The learning outcome data, comprising Arithmetic and Statistics scores, were processed to produce a mapping that differentiates students into three categories: Adequate, Moderate, and Good. K-Means clustering converged at the 12th iteration, with 64 students in the Adequate category (C1), 60 in the Moderate category (C2), and 59 in the Good category (C3), indicating that students are evenly distributed across groups according to their mathematical and statistical abilities. For a student with an Arithmetic score of 85 and a Statistics score of 75, k-NN prediction with a k-value of 61 placed 7 neighbours in Category 1 (Adequate), 3 in Category 2 (Moderate), and a dominant 51 in Category 3 (Good); the student is therefore predicted to fall into Category 3, a 'Good' rating of academic performance. By applying data mining techniques to deepen understanding of student learning outcomes, this study makes a significant contribution to the field of education, demonstrating substantial progress toward a data-driven learning approach that can be tailored to specific needs and improve student learning outcomes.
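The k-NN prediction step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy training data, the category boundaries, and the small k are all hypothetical stand-ins (the study uses the actual clustered dataset and k = 61).

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest neighbours
    (Euclidean distance). `train` is a list of
    ((arithmetic_score, statistics_score), label) pairs."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: three score bands standing in for the paper's
# Adequate / Moderate / Good clusters (the real dataset is not reproduced).
train = (
    [((60 + i, 55 + i), "Adequate") for i in range(5)]
    + [((70 + i, 68 + i), "Moderate") for i in range(5)]
    + [((82 + i, 74 + i), "Good") for i in range(5)]
)

# Predict the category for Arithmetic = 85, Statistics = 75.
print(knn_predict(train, (85, 75), k=5))  # prints: Good
```

With this toy data the five nearest neighbours of (85, 75) all carry the "Good" label, mirroring the paper's result where 51 of 61 neighbours fell into Category 3.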
Muh. Nurtanzis Sutoyo, Alders Paliling, "Optimizing Student Performance Prediction via K-Means and k-NN Integration", International Journal of Education and Management Engineering (IJEME), Vol.15, No.4, pp. 12-22, 2025. DOI:10.5815/ijeme.2025.04.02