Utilizing Random Forest and XGBoost Data Mining Algorithms for Anticipating Students’ Academic Performance

pp. 29-44



Mukesh Kumar 1,*, Navneet Singh 1, Jessica Wadhwa 1, Palak Singh 1, Girish Kumar 1, Ahmed Qtaishat 2

1. School of Computer Application, Lovely Professional University-Phagwara, Punjab, 144001, India

2. Department of Information Technology, Sohar University, Sohar, Sultanate of Oman

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2024.02.03

Received: 7 Apr. 2023 / Revised: 25 Jun. 2023 / Accepted: 12 Aug. 2023 / Published: 8 Apr. 2024

Index Terms

Educational Data Mining, Classification Algorithm, Exploratory Data Analysis, Random Forest Classifier, XGBoost Classifier, Predictive Accuracy


Abstract

Educational data mining is a growing field that analyses educational data to build models for improving education and the effectiveness of educational institutions. It develops approaches for extracting information from educational databases so that decisions within the educational system can be better informed. This paper investigates recent advances in data mining techniques for educational research and analyses the methodologies employed by previous researchers in the area. Six machine learning algorithms, namely Logistic Regression, Gaussian Naive Bayes, Support Vector Machine, Random Forest, K-Nearest Neighbour, and the XGBoost classifier, were evaluated and compared on how well they predict students' academic performance. Applied to scholastic, behavioural, and additional student features, the Random Forest and XGBoost classifiers were more accurate than the other algorithms. Training and testing accuracy were approximately 96.46% and 87.50% for Random Forest, and 95.05% and 84.38% for XGBoost. This approach can give educators insight into students' motivations and behaviours, which may lead to more effective instruction and lower student failure rates. Students' achievements significantly influence the delivery of education.
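The comparison described in the abstract can be illustrated with a minimal sketch in Python using scikit-learn. It is not the authors' code: the synthetic dataset from `make_classification`, the 80/20 stratified split, the hyperparameters, and the helper names `build_models` and `compare` are all assumptions made for illustration. The abstract does not state the split or the settings the paper used. XGBoost is included only if the `xgboost` package happens to be installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_models(seed=0):
    """Return the classifiers named in the abstract (hyperparameters are assumed)."""
    models = {
        # Scale-sensitive models are wrapped in a pipeline with standardisation.
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "Gaussian Naive Bayes": GaussianNB(),
        "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
        "Random Forest": RandomForestClassifier(n_estimators=200, random_state=seed),
        "K-Nearest Neighbour": make_pipeline(
            StandardScaler(), KNeighborsClassifier(n_neighbors=5)
        ),
    }
    try:
        # Optional dependency: skipped silently when xgboost is not installed.
        from xgboost import XGBClassifier

        models["XGBoost"] = XGBClassifier(n_estimators=200, random_state=seed)
    except ImportError:
        pass
    return models


def compare(X, y, seed=0):
    """Fit every model and return {name: (train_accuracy, test_accuracy)}."""
    # Assumed 80/20 stratified split; the paper's actual split is not given here.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    results = {}
    for name, model in build_models(seed).items():
        model.fit(X_tr, y_tr)
        results[name] = (
            accuracy_score(y_tr, model.predict(X_tr)),
            accuracy_score(y_te, model.predict(X_te)),
        )
    return results


# Synthetic stand-in for a student dataset: 3 performance classes, 16 features.
X, y = make_classification(
    n_samples=480, n_features=16, n_informative=8, n_classes=3, random_state=0
)
results = compare(X, y)
for name, (train_acc, test_acc) in sorted(results.items(), key=lambda kv: -kv[1][1]):
    print(f"{name:<24} train={train_acc:.4f}  test={test_acc:.4f}")
```

Reporting training and testing accuracy side by side, as the abstract does, makes any gap between the two visible. A large gap suggests the model fits the training data more closely than it generalises.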

Cite This Paper

Mukesh Kumar, Navneet Singh, Jessica Wadhwa, Palak Singh, Girish Kumar, Ahmed Qtaishat, "Utilizing Random Forest and XGBoost Data Mining Algorithms for Anticipating Students’ Academic Performance", International Journal of Modern Education and Computer Science (IJMECS), Vol.16, No.2, pp. 29-44, 2024. DOI:10.5815/ijmecs.2024.02.03

