Work place: Agnel Charities, Fr. C. Rodrigues Institute of Technology, Navi Mumbai, 400703, India
E-mail: malcolmraj2072@gmail.com
Website: https://orcid.org/0009-0002-0618-021X
Research Interests:
Biography
Malcolm Alex Raj is an undergraduate student in Computer Engineering with an interest in Artificial Intelligence and Machine Learning. He has worked on many AI/ML-related projects aimed at creating better solutions to problems.
By Kavita R. Shelke, Malcolm Alex Raj, Darshana S. Gajbhiye, Rakhi D. Akhare
DOI: https://doi.org/10.5815/ijem.2026.03.22, Pub. Date: 8 Jun. 2026
Social media streams reflect public opinion in real time, but short, noisy tweets make it hard to attribute sentiment to entities. This paper introduces an AI/ML pipeline that classifies the sentiment expressed towards the entities referenced in Twitter messages. Using the Twitter Entity Sentiment Analysis benchmark (twitter_training.csv and twitter_validation.csv), the tweets are normalized (lowercasing, removal of punctuation and platform-specific artifacts, tokenization, stop-word filtering, and lemmatization) and represented using TF-IDF (Term Frequency–Inverse Document Frequency) features with a maximum of 5000 terms. Machine learning models including Logistic Regression, linear SVM, and Multinomial Naive Bayes, together with ensemble and neural methods such as Random Forest, XGBoost, and Multilayer Perceptron (MLP), are trained on the training split and evaluated on the validation split using macro-averaged precision, recall, and F1-score, along with confusion matrix analysis. The results show that linear discriminative models are well suited to sparse TF-IDF spaces, with SVM and Logistic Regression providing balanced class-wise behaviour and Naive Bayes offering a computationally efficient baseline. XGBoost delivers moderate improvements over simple probabilistic models, while Random Forest achieves substantial gains through ensemble learning. The best overall performance is obtained by the MLP, suggesting that non-linear neural modeling captures complex feature interactions and entity-relevance patterns more effectively. Misclassifications concentrate on the Neutral/Irrelevant boundary, which reflects ambiguity about entity-level relevance and motivates future extensions with context-aware deep architectures and entity-conditioned representations. These baselines support monitoring applications for brands and public figures, and they also expose the limits of non-contextual features for sarcasm and implicit targets in Twitter discourse.
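The macro-averaged metrics mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `macro_scores`, the label list, and the toy predictions are assumptions made for illustration only. Macro averaging computes precision, recall, and F1 for each class separately and then takes their unweighted mean, so a small class such as Irrelevant counts as much as a large one.

```python
from collections import Counter

# Assumed label set for illustration; the abstract names only Neutral and Irrelevant explicitly.
LABELS = ["Positive", "Negative", "Neutral", "Irrelevant"]


def macro_scores(y_true, y_pred, labels=LABELS):
    """Return macro-averaged (precision, recall, F1) over the given labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example (hypothetical data): one Neutral tweet is mistaken for Irrelevant.
y_true = ["Neutral", "Neutral", "Irrelevant", "Irrelevant"]
y_pred = ["Neutral", "Irrelevant", "Irrelevant", "Irrelevant"]
precision, recall, f1 = macro_scores(y_true, y_pred, labels=["Neutral", "Irrelevant"])
print(f"macro P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case the single Neutral-to-Irrelevant confusion lowers Neutral recall and Irrelevant precision, the same kind of boundary error the abstract reports as the dominant source of misclassification.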