Multimodal Assessment of Student Engagement by Fusing EEG, Facial Expressions, and Body Posture in an Offline Classroom

Author(s)

Min Song 1,*, I Gusti Putu Sudiarta 2, Putu Kerti Nitiasih 3, Putu Nanci Riastini 4, Zhang Wang 5, Junyi Chai 5

1. Department of Educational Science, Ganesha University of Education (Undiksha), Bali, Indonesia

2. Department of Mathematics, Ganesha University of Education (Undiksha), Bali, Indonesia

3. Department of English Education, Ganesha University of Education (Undiksha), Bali, Indonesia

4. Educational Science Study Program, Ganesha University of Education (Undiksha), Bali, Indonesia

5. Faculty of Information Engineering, College of Science and Technology, Ningbo University, Ningbo, China

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2026.03.12

Received: 2 Feb. 2026 / Revised: 3 Mar. 2026 / Accepted: 28 Mar. 2026 / Published: 8 Jun. 2026

Index Terms

Student Engagement, Multimodal Fusion, MCA Fusion, EEG, Facial Expression Recognition, Body Posture Analysis

Abstract

An accurate and comprehensive assessment of student engagement in classrooms is crucial for enabling data-driven teaching and personalized education. Current approaches rely primarily on teacher observation or student self-reports, which are often subjective, delayed, and unable to capture cognitive engagement. To address these limitations, this study proposes a Multimodal Cognitive-Attention Fusion (MCA Fusion) framework grounded in Fredricks’ three-dimensional engagement model. The framework integrates electroencephalography (EEG), facial expressions, and body posture to simultaneously quantify cognitive, emotional, and behavioral engagement. Built on a Transformer architecture, it employs self-attention to extract temporal features within each modality and introduces a cognition-guided cross-attention mechanism to dynamically integrate the multimodal signals. To validate the framework, experiments were conducted with 36 undergraduate students in real classroom settings. The results demonstrate that the framework significantly outperforms all single-modality baselines, achieving an accuracy of 92% and an F1-score of 94.87%; compared with the best single-modality model (EEG), the F1-score improves by 34.58 percentage points. Ablation studies further confirm the critical roles of the cognitive modality (EEG) and the MCA Fusion mechanism, whose removal reduces the F1-score by 62.58 and 56.16 percentage points, respectively. The proposed approach not only offers a theoretically informed and empirically evaluated framework for engagement recognition but also lays a methodological foundation for future closed-loop “perception–assessment–feedback” systems in intelligent learning environments.
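To make the fusion design concrete, the following PyTorch sketch illustrates the architecture the abstract describes: a temporal self-attention encoder per modality, followed by a cognition-guided cross-attention step in which the EEG stream supplies the queries and the facial and posture streams supply keys and values. This is a minimal illustration, not the authors’ implementation; the class name, layer counts, embedding dimension, head count, and mean-pooling choice are all assumptions.

import torch
import torch.nn as nn

class MCAFusionSketch(nn.Module):
    # Illustrative sketch of cognition-guided multimodal fusion (not the paper's code).
    def __init__(self, d_model=128, n_heads=4, n_classes=2):
        super().__init__()
        def encoder():
            # Temporal self-attention within a single modality.
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            return nn.TransformerEncoder(layer, num_layers=2)
        self.eeg_enc, self.face_enc, self.pose_enc = encoder(), encoder(), encoder()
        # Cross-attention: EEG (cognitive) features query the visual modalities.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, eeg, face, pose):
        # Each input: (batch, time, d_model) after modality-specific embedding.
        q = self.eeg_enc(eeg)
        kv = torch.cat([self.face_enc(face), self.pose_enc(pose)], dim=1)
        fused, _ = self.cross_attn(q, kv, kv)      # cognition-guided fusion
        return self.classifier(fused.mean(dim=1))  # pool over time, classify

# Smoke test with random windows: batch of 8, 50 time steps per modality.
x = lambda: torch.randn(8, 50, 128)
logits = MCAFusionSketch()(x(), x(), x())  # shape: (8, 2)

In this reading, the EEG queries decide which moments of the facial and postural streams matter, consistent with the abstract’s claim that the cognitive modality guides the integration of the other signals.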

Cite This Paper

Min Song, I Gusti Putu Sudiarta, Putu Kerti Nitiasih, Putu Nanci Riastini, Zhang Wang, Junyi Chai, "Multimodal Assessment of Student Engagement by Fusing EEG, Facial Expressions, and Body Posture in an Offline Classroom", International Journal of Modern Education and Computer Science (IJMECS), Vol.18, No.3, pp. 190-203, 2026. DOI: 10.5815/ijmecs.2026.03.12

References

[1]E. Mangina and G. Psyrra, “Review of Learning Analytics and Educational Data Mining Applications,” EDULEARN21 Proc., vol. 1, pp. 949–954, 2021, doi: 10.21125/edulearn.2021.0250.
[2]V. T. Tran and N. H. Tran, “A Review of Smart Education and Lessons Learned for An Effective Application in Binh Duong Province, Vietnam,” Pegem Egit. ve Ogr. Derg., vol. 13, no. 1, pp. 234–240, 2022, doi: 10.47750/pegegog.13.01.25.
[3]A. Radloff and H. Coates, “Doing More for Learning: Enhancing Engagement and Outcomes. Australasian Survey of Student Engagement (AUSSE) Report,” 2010. [Online]. Available: https://www.acer.org/files/AUSSE_Australasian-Student-Engagement-Report-ASER-2009.pdf
[4]S. G. T. Ong and G. C. L. Quek, “Enhancing teacher–student interactions and student online engagement in an online learning environment,” Learn. Environ. Res., vol. 26, no. 3, pp. 681–707, 2023, doi: 10.1007/s10984-022-09447-5.
[5]H. Ross, Y. Cen, and Z. Zhou, “Assessing student engagement in China: Responding to local and global discourse on raising educational quality,” Curr. Issues Comp. Educ., vol. 14, no. 1, pp. 24–37, 2011.
[6]T. L. Hofkens and E. Ruzek, “Measuring student engagement to inform effective interventions in schools,” in Handbook of Student Engagement Interventions: Working with Disengaged Students, J. A. Fredricks, A. L. Reschly, and S. L. Christenson, Eds., Elsevier Academic Press, 2019, pp. 309–324. doi: 10.1016/B978-0-12-813413-9.00021-8.
[7]J. A. Fredricks, P. C. Blumenfeld, and A. H. Paris, “School Engagement: Potential of the Concept, State of the Evidence,” Rev. Educ. Res., vol. 74, no. 1, pp. 59–109, 2004, doi: 10.3102/00346543074001059.
[8]S. G. Khenkar, S. K. Jarraya, A. Allinjawi, S. Alkhuraiji, N. Abuzinadah, and F. A. Kateb, “Deep Analysis of Student Body Activities to Detect Engagement State in E-Learning Sessions,” Appl. Sci., vol. 13, no. 4, 2023, doi: 10.3390/app13042591.
[9]I. Qarbal, N. Sael, and S. Ouahabi, “Student’s Engagement Detection Based on Computer Vision: A Systematic Literature Review,” IEEE Access, vol. 13, pp. 140519–140545, 2025, doi: 10.1109/ACCESS.2025.3596885.
[10]Q. Liu and X. Jiang, “Classroom Behavior Recognition Using Computer Vision: A Systematic Review,” Sensors, vol. 25, no. 2, p. 373, 2025, doi: 10.3390/s25020373.
[11]S. Arefnejad, A. Khadivi, and F. Alipour, “Challenges and Applications of Artificial Intelligence in Education: A Systematic Review,” J. Knowledge-Research Stud., vol. 3, no. 4, 2024, doi: 10.22034/jkrs.2024.63182.1106.
[12]C. Berka, D. J. Levendowski, M. N. Lumicao, and A. Yau, “EEG correlates of task engagement and mental workload in vigilance, learning, and memory tasks,” Aviat. Space. Environ. Med., vol. 78, no. 5, pp. B231–B244, 2007.
[13]K. Yin, H. Bin Shin, D. Li, and S. W. Lee, “EEG-based Multimodal Representation Learning for Emotion Recognition,” Int. Winter Conf. Brain-Computer Interface, BCI, 2025, doi: 10.1109/BCI65088.2025.10931743.
[14]A. Sukumaran and A. Manoharan, “Student Engagement Recognition: Comprehensive Analysis Through EEG and Verification by Image Traits Using Deep Learning Techniques,” IEEE Access, vol. 13, pp. 11639–11662, 2025, doi: 10.1109/ACCESS.2025.3526187.
[15]L. Wei, Y. Yu, Y. Qin, and S. Zhang, “A Survey of EEG-Based Approaches to Classroom Attention Assessment in Education,” Information, vol. 16, no. 10, p. 860, 2025, doi: 10.3390/info16100860.
[16]S. K. D’Mello, E. Dieterle, and A. L. Duckworth, “Advanced, analytic, automated (AAA) Measurement of Engagement During Learning,” Educ. Psychol., vol. 52, no. 2, pp. 104–123, 2017, doi: 10.1080/00461520.2017.1281747.
[17]J. D. T. Guerrero-Sosa, F. P. Romero, V. H. Menéndez-Domínguez, J. Serrano-Guerrero, A. Montoro-Montarroso, and J. A. Olivas, “A Comprehensive Review of Multimodal Analysis in Education,” Appl. Sci., vol. 15, no. 11, p. 5896, 2025.
[18]O. R. Yürüm, “Technology-Enhanced Multimodal Learning Analytics in Higher Education: A Systematic Literature Review,” IEEE Access, vol. 13, pp. 92057–92073, 2025, doi: 10.1109/ACCESS.2025.3572467.
[19]H. Ouhaichi, D. Spikol, and B. Vogel, “Research trends in multimodal learning analytics: A systematic mapping study,” Comput. Educ. Artif. Intell., vol. 4, p. 100136, 2023, doi: 10.1016/j.caeai.2023.100136.
[20]N. Bergdahl, M. Bond, J. Sjöberg, M. Dougherty, and E. Oxley, “Unpacking student engagement in higher education learning analytics: a systematic review,” Int. J. Educ. Technol. High. Educ., 2024, doi: 10.1186/s41239-024-00493-y.
[21]K. Mangaroska, K. Sharma, D. Gašević, and M. Giannakos, “Multimodal learning analytics to inform learning design: Lessons learned from computing education,” J. Learn. Anal., vol. 7, no. 3, pp. 79–97, 2020, doi: 10.18608/JLA.2020.73.7.
[22]A. Sabuncuoglu and T. M. Sezgin, “Developing a Multimodal Classroom Engagement Analysis Dashboard for Higher-Education,” Proc. ACM Human-Computer Interact., vol. 7, no. EICS, 2023, doi: 10.1145/3593240.
[23]L. Zhang, J. L. Hung, X. Du, H. Li, and Z. Hu, “Multimodal Fast–Slow Neural Network for learning engagement evaluation,” Data Technol. Appl., vol. 57, no. 3, pp. 418–435, 2023, doi: 10.1108/DTA-05-2022-0199.
[24]K. Mallibhat, “Student attention detection using multimodal data fusion,” in 2024 IEEE International Conference on Advanced Learning Technologies (ICALT), IEEE, 2024, pp. 295–297. doi: 10.1109/ICALT61570.2024.00092.
[25]M. Mohammadi, E. Tajik, R. Martinez-Maldonado, S. Sadiq, W. Tomaszewski, and H. Khosravi, “Artificial intelligence in multimodal learning analytics: A systematic literature review,” Comput. Educ. Artif. Intell., vol. 8, p. 100426, 2025, doi: 10.1016/j.caeai.2025.100426.
[26]S. Dikker et al., “Brain-to-Brain Synchrony Tracks Real-World Dynamic Group Interactions in the Classroom,” Curr. Biol., vol. 27, no. 9, pp. 1375–1380, 2017, doi: 10.1016/j.cub.2017.04.002.
[27]P. Chejara, L. P. Prieto, M. J. Rodriguez-Triana, A. Ruiz-Calleja, and M. Khalil, “Impact of window size on the generalizability of collaboration quality estimation models developed using Multimodal Learning Analytics,” ACM Int. Conf. Proceeding Ser., vol. 1, no. 1, pp. 559–565, 2023, doi: 10.1145/3576050.3576143.
[28]P. Antonenko, F. Paas, R. Grabner, and T. van Gog, “Using Electroencephalography to Measure Cognitive Load,” Educ. Psychol. Rev., vol. 22, no. 4, pp. 425–438, 2010, doi: 10.1007/s10648-010-9130-y.
[29]V. J. Lawhern, A. J. Solon, N. R. Waytowich, S. M. Gordon, C. P. Hung, and B. J. Lance, “EEGNet: A compact convolutional neural network for EEG-based brain-computer interfaces,” J. Neural Eng., vol. 15, no. 5, pp. 1–30, 2018, doi: 10.1088/1741-2552/aace8c.
[30]A. Abedi and S. S. Khan, “Improving state-of-the-art in Detecting Student Engagement with ResNet and TCN Hybrid Network,” in Proceedings - 2021 18th Conference on Robots and Vision, CRV 2021, IEEE, 2021, pp. 151–157. doi: 10.1109/CRV52889.2021.00028. 
[31]Z. Cao, G. Hidalgo, T. Simon, S. E. Wei, and Y. Sheikh, “OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 1, pp. 172–186, 2021, doi: 10.1109/TPAMI.2019.2929257.
[32]R. Artstein and M. Poesio, “Inter-coder agreement for computational linguistics,” Comput. Linguist., vol. 34, no. 4, pp. 555–596, 2008, doi: 10.1162/coli.07-034-R2.
[33]G. Y. Li, J. Chen, S. I. Jang, K. Gong, and Q. Li, “SwinCross: Cross-modal Swin transformer for head-and-neck tumor segmentation in PET/CT images,” Med. Phys., vol. 51, no. 3, pp. 2096–2107, 2024, doi: 10.1002/mp.16703.
[34]X. Jiang, “Deep Learning-Based Multimodal Fusion Algorithm for Assessing Online Learning Engagement,” in Proceedings of the 10th International Conference on Cyber Security and Information Engineering (ICCSIE 2025), Association for Computing Machinery (ACM), 2026, pp. 88–93. doi: 10.1145/3759179.3759192.
[35]D. Dresvyanskiy, A. Karpov, and W. Minker, “A Cross-Multi-modal Fusion Approach for Enhanced Engagement Recognition,” in Speech and Computer: 26th International Conference, SPECOM 2024, A. Karpov and V. Delić, Eds., Cham: Springer Nature Switzerland, 2024, pp. 3–17. doi: 10.1007/978-3-031-78014-1_1.
[36]J. M. Johnson and T. M. Khoshgoftaar, “Survey on deep learning with class imbalance,” J. Big Data, vol. 6, no. 1, 2019, doi: 10.1186/s40537-019-0192-5.
[37]D. P. Kingma and J. L. Ba, “Adam: A method for stochastic optimization,” 3rd Int. Conf. Learn. Represent. ICLR 2015 - Conf. Track Proc., pp. 1–15, 2015.
[38]I. Loshchilov and F. Hutter, “SGDR: Stochastic gradient descent with warm restarts,” 5th Int. Conf. Learn. Represent. ICLR 2017 - Conf. Track Proc., pp. 1–16, 2017.
[39]I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[40]E. Fan, M. Bower, and J. Siemon, “From heartbeats to actions: Multimodal learning analytics of cognitive and behavior engagement in real classrooms,” Learn. Instr., vol. 103, p. 102325, 2026, doi: 10.1016/j.learninstruc.2026.102325.
[41]C. Li, X. Weng, Y. Li, and T. Zhang, “Multimodal Learning Engagement Assessment System: An Innovative Approach to Optimizing Learning Engagement,” Int. J. Human–Computer Interact., vol. 41, no. 5, pp. 3474–3490, Mar. 2025, doi: 10.1080/10447318.2024.2338616.