Classification and Regression Trees (CART) for Predictive Modeling in Blended Learning



Nick Z. Zacharis 1,*

1. Department of Computer Systems Engineering, Piraeus University of Applied Sciences, 12244, Greece

* Corresponding author.


Received: 3 Oct. 2017 / Revised: 10 Nov. 2017 / Accepted: 27 Nov. 2017 / Published: 8 Mar. 2018

Index Terms

Educational Data Mining, Student Data, Blended Learning, Decision Trees, CART Algorithm, Moodle


Today, Internet and Web technologies not only give students opportunities for flexible interaction with study materials, peers and instructors, but also generate large amounts of usage data that can be processed to reveal behavioral patterns of study and learning. This study analyzed data extracted from a Moodle-based blended learning course to build a student model that predicts course performance. The CART decision tree algorithm was used to classify students and predict those at risk, based on the impact of four online activities: message exchanging, group wiki content creation, course file opening and online quiz taking. The overall percentage of correct classifications was about 99.1%, indicating that the model is sensitive enough to identify very specific groups at risk.
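To make the classification step concrete, the following is a minimal sketch of how a CART-style tree chooses a single binary split. It assumes the Gini impurity criterion commonly associated with CART; the abstract does not state which criterion or software the study used. The activity counts, the labels and the helper names (`gini`, `best_split`) are hypothetical and serve only as illustration; they are not the study's data or implementation.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the binary split
    that minimizes the weighted Gini impurity of the two child nodes."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

# Hypothetical per-student activity counts (NOT data from the study):
# messages exchanged, wiki edits, course files opened, quizzes taken.
FEATURES = ["messages", "wiki_edits", "file_views", "quizzes"]
rows = [
    (2, 0, 5, 1),
    (3, 1, 8, 2),
    (1, 0, 4, 0),
    (10, 4, 30, 6),
    (12, 5, 25, 7),
    (8, 3, 28, 5),
]
labels = ["at_risk", "at_risk", "at_risk", "pass", "pass", "pass"]

feature, threshold, score = best_split(rows, labels)
print(FEATURES[feature], threshold, score)
```

A full CART tree applies this search recursively to each child node and then prunes the result. On real course logs, a held-out test set or cross-validation would be needed before reading much into the resulting accuracy.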

Cite This Paper

Nick Z. Zacharis, "Classification and Regression Trees (CART) for Predictive Modeling in Blended Learning", International Journal of Intelligent Systems and Applications (IJISA), Vol.10, No.3, pp.1-9, 2018. DOI:10.5815/ijisa.2018.03.01

