Workplace: Computer Science Department, Faculty of Computers and Information, Menoufia University, Shebin Elkom 32511, Egypt
E-mail: Rania_anwer@hotmail.com
Research Interests: Big Data
Biography
Rania A. Anwer received her BSc in Electrical Engineering from the Communication and Electronics Department at Zagazig University, Shubra Faculty of Engineering, in 2004. Her research interests include software engineering, machine learning, big data, and security.
By Rania Ahmed Mahmoud Hussein Arabi Keshk
DOI: https://doi.org/10.5815/ijitcs.2026.01.05, Pub. Date: 8 Feb. 2026
In the field of human-computer interaction, identifying emotion from speech and understanding the full context of spoken communication are challenging tasks, since emotion is inherently imprecise and demands detailed speech analysis. In speech emotion recognition, various techniques have been employed to extract emotions from audio signals, including several well-established speech analysis and classification methods. Despite numerous advances in recent years, many studies still fail to consider the semantic information present in speech. Our study proposes a novel approach that captures both the paralinguistic and semantic aspects of the speech signal by combining state-of-the-art machine learning techniques with carefully crafted feature extraction strategies. We address this task using feature-engineering-based techniques, extracting meaningful audio features such as energy, pitch, harmonics, pauses, central moments, chroma, zero-crossing rate, and Mel-frequency cepstral coefficients (MFCCs). These features capture important acoustic patterns that help the model learn emotional cues more effectively. The work is conducted primarily on the IEMOCAP dataset, a large and well-annotated emotional speech corpus. Framing the task as a multi-class classification problem, we extract 15 features from the audio signal and use them to train five machine learning classifiers. Additionally, we incorporate text-domain features to reduce ambiguity in emotional interpretation. We evaluate the model's performance using accuracy, precision, recall, and F-score across all experiments.
[...]
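The abstract describes a feature-engineering pipeline: frame-level acoustic features are pooled into fixed-length vectors per utterance and fed to conventional classifiers, which are then scored with accuracy, precision, recall, and F-score. The following is a minimal sketch of that idea in Python, assuming librosa and scikit-learn; the loader load_corpus, the 16 kHz sample rate, and the random-forest classifier are hypothetical illustrations, since the paper's exact 15-feature set and five classifiers are not listed in the excerpt above.

    # Sketch of the feature-engineering pipeline outlined in the abstract.
    # Assumes librosa and scikit-learn. The corpus loader and classifier
    # choice are hypothetical; the paper's actual 15 features and five
    # classifiers may differ.
    import numpy as np
    import librosa
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, precision_recall_fscore_support

    def extract_features(path):
        """Pool a few of the named frame-level features into one vector."""
        y, sr = librosa.load(path, sr=16000)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # MFCCs
        chroma = librosa.feature.chroma_stft(y=y, sr=sr)     # chroma
        zcr = librosa.feature.zero_crossing_rate(y)          # zero-crossing rate
        rms = librosa.feature.rms(y=y)                       # energy
        # Mean-pool each frame-level feature to a fixed-length vector.
        return np.concatenate([mfcc.mean(axis=1), chroma.mean(axis=1),
                               zcr.mean(axis=1), rms.mean(axis=1)])

    # Hypothetical loader returning (wav paths, emotion labels), e.g. IEMOCAP.
    files, labels = load_corpus()
    X = np.stack([extract_features(f) for f in files])
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2,
                                              stratify=labels)

    clf = RandomForestClassifier(n_estimators=300).fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    prec, rec, f1, _ = precision_recall_fscore_support(y_te, pred,
                                                       average="macro")
    print(accuracy_score(y_te, pred), prec, rec, f1)

Macro averaging is used here so that minority emotion classes weigh equally in precision, recall, and F-score; the paper's exact averaging scheme is not stated in the excerpt.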