Workplace: Department of Computer Science and Information Technology, K L University, Vaddeswaram, Guntur District 522302, India
E-mail: sampathpatchigolla@gmail.com
Website: https://orcid.org/0009-0002-0121-4564
Research Interests:
Biography
Sampath Patchigolla is a student with academic experience at KL University and GITAM Deemed to be University, where he pursued studies in engineering and technology. His academic engagement has included project work and internships, notably at Elitelogix Exim Agency, where he gained practical exposure to finance and management alongside his technical education. He has taken part in collaborative research on machine learning, facial emotion recognition, and computational intelligence, contributing to deep learning applications. He is committed to interdisciplinary learning and research and aims to apply his educational foundation and practical experience to solving real-world problems.
By Srinivas P. V. V. S., Shaik Nazeera Khamar, Nohith Borusu, Mohan Guru Raghavendra Kota, Harika Vuyyuru, Sampath Patchigolla
DOI: https://doi.org/10.5815/ijigsp.2026.01.07, Pub. Date: 8 Feb. 2026
In affective computing research, multi-modal emotion detection has gained popularity as a way to boost recognition robustness and overcome the constraints of processing a single type of data. Human emotions are expressed through a variety of modalities, including physiological indicators, facial expressions, and neuroimaging signals. Here, a novel deep attention mechanism is used for detecting multi-modal emotions. Initially, data are collected as audio and video streams. For dimensionality reduction, audio features are extracted using the Constant-Q chromagram and Mel-Frequency Cepstral Coefficients (MM-FC2). After extraction, audio feature generation is carried out by a Convolutional Dense Capsule Network (Conv_DCN). For the video data, key frame extraction is performed using enhanced spatial-temporal and second-order Gaussian kernels; second-order Gaussian kernels are a powerful tool for extracting features from video and converting them into a format suitable for image-based analysis. Video feature generation is then performed with DenseNet-169. Finally, the extracted features are fused and emotions are detected using a Weighted Late Fusion Deep Attention Neural Network (WLF_DAttNN). The system is implemented in Python, achieving an accuracy of 97% on the RAVDESS dataset and 96% on the CREMA-D dataset.
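The audio front end named in the abstract (Constant-Q chromagram plus MFCC) can be illustrated with a minimal sketch using librosa. This is not the authors' code; the parameter values (n_mfcc, hop_length) are illustrative assumptions, not values reported in the paper.

```python
# Minimal sketch of the audio front end described in the abstract:
# Constant-Q chromagram + MFCC features extracted with librosa.
# n_mfcc and hop_length are illustrative assumptions, not values
# reported in the paper.
import librosa
import numpy as np

def extract_audio_features(path, n_mfcc=40, hop_length=512):
    y, sr = librosa.load(path, sr=None)  # keep the file's native sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, hop_length=hop_length)
    chroma = librosa.feature.chroma_cqt(y=y, sr=sr, hop_length=hop_length)
    # Stack along the feature axis: shape (n_mfcc + 12, n_frames)
    return np.vstack([mfcc, chroma])
```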
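For the video branch, the abstract names second-order Gaussian kernels for key frame extraction but not the selection rule. The sketch below, a hedged assumption rather than the paper's method, scores frames by how much their second-order Gaussian response changes over time and keeps the most salient ones.

```python
# Hedged sketch of key-frame scoring with a second-order Gaussian kernel.
# The abstract only names the kernel; the scoring rule (keep the frames
# whose second-order Gaussian response changes most between consecutive
# frames) is an illustrative assumption.
import numpy as np
from scipy.ndimage import gaussian_filter

def keyframe_indices(frames, sigma=2.0, top_k=16):
    """frames: array of shape (n_frames, H, W), grayscale video."""
    # order=2 applies the second derivative of the Gaussian along each
    # spatial axis, emphasizing edge/blob structure in every frame.
    responses = np.stack([gaussian_filter(f.astype(float), sigma=sigma, order=2)
                          for f in frames])
    # Temporal change of the response as a simple spatial-temporal saliency score.
    scores = np.abs(np.diff(responses, axis=0)).sum(axis=(1, 2))
    return np.sort(np.argsort(scores)[-top_k:])  # most salient transitions, in order
```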
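The fusion stage can be sketched as a weighted late fusion with an attention layer over the two modality embeddings. The WLF_DAttNN architecture itself is not given in the abstract, so the layer sizes, the softmax-based modality weighting, and the example dimensions (52 for pooled audio features, 1664 for DenseNet-169's global pool, 8 emotion classes as in RAVDESS) are all assumptions for illustration.

```python
# Hedged sketch of weighted late fusion with attention over two modality
# embeddings. Layer sizes and the softmax weighting are assumptions; the
# paper's WLF_DAttNN architecture is not reproduced here.
import torch
import torch.nn as nn

class WeightedLateFusion(nn.Module):
    def __init__(self, audio_dim, video_dim, hidden=128, n_classes=8):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, hidden)   # project audio branch
        self.video_proj = nn.Linear(video_dim, hidden)   # project video branch
        self.attn = nn.Linear(hidden, 1)                 # scalar score per modality
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, audio_feat, video_feat):
        # (batch, 2, hidden): one row per modality
        branches = torch.stack(
            [torch.relu(self.audio_proj(audio_feat)),
             torch.relu(self.video_proj(video_feat))], dim=1)
        weights = torch.softmax(self.attn(branches), dim=1)  # (batch, 2, 1)
        fused = (weights * branches).sum(dim=1)              # weighted late fusion
        return self.classifier(fused)

# Illustrative dimensions: 52 pooled audio features (40 MFCC + 12 chroma)
# and 1664 DenseNet-169 features; 8 classes as in RAVDESS.
model = WeightedLateFusion(audio_dim=52, video_dim=1664, n_classes=8)
logits = model(torch.randn(4, 52), torch.randn(4, 1664))
```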