Work place: Department of Computer Science and Engineering, GITAM School of Technology, GITAM (Deemed to be University), Visakhapatnam, India
E-mail: rgedela@gitam.edu
Website: https://orcid.org/0000-0002-4454-0581
Research Interests:
Biography
Ravi Teja Gedela obtained his Ph.D. in Computer Science and Engineering from the National Institute of Technology Silchar, Assam, in 2024. He completed his M.Tech and B.Tech in Computer Science and Engineering at JNTU Kakinada in 2012 and 2010, respectively. He works as an Assistant Professor in the Department of Computer Science and Engineering at GITAM (Deemed to be University), Andhra Pradesh, India. Prior to this, he served in various academic roles at reputed engineering institutions. His research centers on Natural Language Processing, particularly sarcasm detection in multilingual contexts, with additional interests in Machine Learning and Deep Learning applications. He has published several SCIE-indexed journal articles and conference papers with Springer and IEEE. Dr. Gedela has also qualified for UGC-NET multiple times and is actively involved in mentoring and collaborative research.
By Ravi Teja Gedela, J. N. V. R. Swarup Kumar, Venkateswararao Kuna, Sasibhushana Rao Pappu
DOI: https://doi.org/10.5815/ijmecs.2025.06.08, Pub. Date: 8 Dec. 2025
Sarcasm, a subtle form of expression, is challenging to detect, especially on modern communication platforms where communication extends beyond text to videos, images, and audio. Traditional sarcasm detection methods rely solely on textual data and often struggle to capture the nuanced emotional inconsistencies inherent in sarcastic remarks. To overcome these shortcomings, this paper introduces a novel multimodal framework incorporating text, audio, and emoji data for more effective sarcasm detection and emotion classification. A key component of this framework is the Contextualized Semantic Self-Guided BERT (CS-SGBERT) model, which generates efficient word embeddings. First, frequency spectral analysis is performed on the audio data, followed by preprocessing and feature extraction, while the text data undergoes preprocessing to extract lexicon and irony features. Emojis are analyzed for polarity scores. Together, these steps provide a rich set of multimodal features. The fused features are then optimized using the Camberra-based Dingo Optimization Algorithm (C-DOA). The selected features and the word embeddings from the preprocessed text are fed to Entropy-based Robust Scaling - Gated Recurrent Units (E-RS-GRU) for sarcasm detection. Experimental results on the MUStARD dataset show that the proposed E-RS-GRU model achieves an accuracy of 76.65% and an F1-score of 76.9%, with a relative improvement of 2.18% over the best-performing baseline and 1.25% over the best-performing state-of-the-art model. Additionally, a KLKI-Fuzzy model is proposed for emotion recognition. It dynamically adjusts membership functions through Kullback-Leibler Kriging Interpolation (KLKI) and improves emotion classification by processing features from all modalities. The KLKI-Fuzzy model exhibits enhanced emotion recognition performance with reduced fuzzification and defuzzification times.
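Two of the building blocks named in the abstract have well-known textbook forms, which can be sketched as follows. This is only a minimal illustration, assuming the standard Canberra distance and the standard median/IQR robust scaling; the abstract does not describe how these are adapted inside C-DOA or the entropy-based E-RS-GRU variant, and the function names below are hypothetical.

```python
from statistics import quantiles


def canberra_distance(u, v):
    """Textbook Canberra distance: sum of |a - b| / (|a| + |b|).

    Terms where both coordinates are zero are skipped (treated as 0).
    Assumption: this is the plain metric, not the paper's C-DOA variant.
    """
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom != 0:
            total += abs(a - b) / denom
    return total


def robust_scale(values):
    """Textbook robust scaling: (x - median) / IQR.

    Assumption: the entropy-based weighting mentioned in the abstract
    is NOT modeled here; this is only the standard scaling step.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        # Degenerate spread: return zeros instead of dividing by zero.
        return [0.0 for _ in values]
    return [(x - q2) / iqr for x in values]


if __name__ == "__main__":
    print(canberra_distance([1, 2], [3, 2]))
    print(robust_scale([1, 2, 3, 4, 5]))
```

Both functions use only the Python standard library. Median and interquartile range are less sensitive to outliers than mean and standard deviation, which is the usual motivation for robust scaling ahead of a recurrent model.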