Prediction of Students’ Performance in Introductory Programming in Higher Education

Author(s)

João P. J. Pires 1, Jorge F. R. Bernardino 1, Anabela J. Gomes 1, Ana Rosa P. Borges 1,*, Fernanda M. R. Brito R. Correia 1

1. Coimbra Institute of Engineering, Polytechnic University of Coimbra, Portugal

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2026.01.01

Received: 25 Apr. 2025 / Revised: 26 Jun. 2025 / Accepted: 21 Oct. 2025 / Published: 8 Feb. 2026

Index Terms

Introductory Programming, Machine Learning, Prediction, Programming Cognitive Tests

Abstract

Analyzing student performance in Introductory Programming courses in Higher Education is crucial for early intervention and improved academic outcomes. This study investigates the predictive potential of a Programming Cognitive Test in assessing student aptitude and forecasting success in an Introductory Programming course. Data were collected from 180 students, both freshmen and repeating students, enrolled in a Computer Engineering program. The dataset includes the Programming Cognitive Test results, background variables, and final course outcomes. To identify latent patterns within the data, the K-means clustering algorithm was applied, focusing particularly on freshmen to avoid bias from prior programming exposure. In parallel, six Machine Learning classification models were developed and evaluated to predict students’ likelihood of passing the Introductory Programming course: Decision Tree, K-Nearest Neighbor, Naïve Bayes, Random Forest, Support Vector Machine, and Deep Neural Network. Among these, the Deep Neural Network model performed best, achieving the highest Accuracy, Recall, and F1-score and effectively identifying students at risk of underperformance. These findings underscore the potential of this model in educational settings, where timely and accurate detection of struggling students can enable proactive, targeted interventions.
This work contributes to the field by combining cognitive assessment with predictive modelling, offering a novel approach to forecasting programming performance. The models and methods described are adaptable for broader educational applications and may assist educators in refining teaching strategies and improving retention and success rates in programming education.
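
As an illustration of the three metrics named in the abstract, the following minimal Python sketch computes Accuracy, Recall, and F1-score for a binary pass/at-risk prediction. It is not the authors' code: the function name `classification_metrics`, the choice of "at risk" as the positive class (label 1), and the ten example labels are all assumptions made for illustration only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, recall and F1-score for a binary classification task.

    Assumption (not from the paper): `positive` marks the "at risk" class,
    so recall measures the share of at-risk students that were detected.
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels for ten students: 1 = at risk, 0 = expected to pass.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
    for name, value in classification_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
    # accuracy: 0.800, recall: 0.667, f1: 0.667
```

In this toy example one at-risk student is missed, which lowers Recall to 0.667 even though Accuracy is 0.800. This is why Recall and F1-score are often reported alongside Accuracy when the goal is to detect struggling students.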

Cite This Paper

João P. J. Pires, Jorge F. R. Bernardino, Anabela J. Gomes, Ana Rosa P. Borges, Fernanda M. R. Brito R. Correia, "Prediction of Students’ Performance in Introductory Programming in Higher Education", International Journal of Modern Education and Computer Science (IJMECS), Vol.18, No.1, pp. 1-20, 2026. DOI:10.5815/ijmecs.2026.01.01
