Work place: Department of Computer Applications, Kalasalingam Academy of Research and Education, Krishnankoil, Tamil Nadu, India
E-mail: heiram85@gmail.com
Biography
Mr. R. Ramkumar (ORCID ID: https://orcid.org/0009-0006-5300-5181) is a Research Scholar in the Department of Computer Applications at Kalasalingam Academy of Research and Education, Krishnankoil, Tamil Nadu, India. He has worked as an Assistant Professor at Kaliswari College (Autonomous), Sivakasi, and before that at Sri Krishnasamy Arts and Science College, Sattur, and at PSR Engineering College, Sivakasi, all in Tamil Nadu. He completed his MCA in 2009 at Arulmigu Kalasalingam College of Engineering (Anna University) and his B.Sc. in Computer Science in 2006 at Ayya Nadar Janaki Ammal College (Madurai Kamaraj University), Sivakasi.
By Ramkumar R., Sureshkumar Nagarajan, Dinesh Prasanth Ganapathi
DOI: https://doi.org/10.5815/ijitcs.2025.06.02, Pub. Date: 8 Dec. 2025
In artificial intelligence, voice categorization is important for many applications. Tamil, one of the oldest languages in the world, has rich regional slang that differs in tone, pronunciation, and emotive expression. These slang words are difficult to categorize because they are informal and annotated audio data is limited. This study proposes an enhanced deep learning framework for Tamil slang classification using a balanced audio corpus. The framework integrates data-specific pre-processing techniques, including Mel spectrograms, chroma features, and spectral contrast, to capture the nuanced characteristics of Tamil speech. A DenseNet backbone, combined with LSTM and GRU layers, models both temporal and spectral information. The proposed FRAE-PSA module is a novel application of the Pyramid Split Attention (PSA) mechanism, adapted to handle regional and affective variations in speech. Unlike existing PSA or Transformer-based approaches, FRAE-PSA splits the audio frequency spectrum and adapts attention weights dynamically based on auxiliary tasks. A multi-branch architecture fuses temporal and spectral features, and multi-task learning is used to enhance regional accent and emotion detection. Custom loss functions and lightweight networks optimize model efficiency. Experimental results show up to a 15% improvement in classification accuracy over baseline models, demonstrating the framework's effectiveness for real-world Tamil slang classification tasks.
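The idea of splitting the frequency spectrum and reweighting the parts with attention can be illustrated with a minimal sketch. This is not the authors' FRAE-PSA module: the function name `band_split_attention`, the number of bands, and the use of mean band energy as the attention score are all assumptions made for illustration, whereas the abstract states that the weights are adapted based on auxiliary tasks. The input is a random array standing in for a Mel spectrogram.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def band_split_attention(spec, num_bands=4):
    """Reweight frequency bands of a (n_freq, n_frames) spectrogram.

    Hypothetical simplification: each band is scored by its mean energy.
    The paper's module instead adapts the weights using auxiliary tasks.
    """
    # Split the spectrogram into contiguous bands along the frequency axis.
    bands = np.array_split(spec, num_bands, axis=0)
    # One scalar score per band, turned into weights that sum to 1.
    scores = np.array([band.mean() for band in bands])
    weights = softmax(scores)
    # Scale each band by its weight and stack the bands back together.
    reweighted = np.concatenate(
        [w * band for w, band in zip(weights, bands)], axis=0
    )
    return reweighted, weights


rng = np.random.default_rng(0)
spec = rng.random((64, 100))  # stand-in for a 64-bin Mel spectrogram
out, weights = band_split_attention(spec, num_bands=4)
```

The output keeps the input shape, so a block like this could in principle sit in front of a convolutional backbone. In a trained model the scores would come from learned layers, not from a fixed statistic.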