Malcolm Alex Raj

Work place: Agnel Charities, Fr. C. Rodrigues Institute of Technology, Navi Mumbai, 400703, India

E-mail: malcolmraj2072@gmail.com

Website: https://orcid.org/0009-0002-0618-021X

Research Interests:

Biography

Malcolm Alex Raj is an undergraduate student in Computer Engineering with an interest in Artificial Intelligence and Machine Learning. He has worked on many AI/ML-related projects aimed at building better solutions to problems.

Author Articles
Benchmarking Linear, Ensemble, and Neural Models for Entity-Level Sentiment Analysis on Twitter Data

By Kavita R. Shelke, Malcolm Alex Raj, Darshana S. Gajbhiye, Rakhi D. Akhare

DOI: https://doi.org/10.5815/ijem.2026.03.22, Pub. Date: 8 Jun. 2026

Social media streams reflect public opinion in real time, but short, noisy tweets make it hard to attribute sentiment to specific entities. This paper introduces an AI/ML pipeline that classifies the sentiment expressed towards the entities referenced in Twitter messages. Using the Twitter Entity Sentiment Analysis benchmark (twitter_training.csv and twitter_validation.csv), the tweets are normalized (lowercasing, handling of punctuation and platform-specific artifacts, tokenization, stop-word filtering, and lemmatization) and represented using TF-IDF (Term Frequency–Inverse Document Frequency) features with a maximum of 5000 terms. Machine learning models, including Logistic Regression, linear SVM, and Multinomial Naive Bayes, along with ensemble and neural methods such as Random Forest, XGBoost, and a Multilayer Perceptron (MLP), are trained on the training split and evaluated on the validation split using macro-averaged precision, recall, and F1-score, together with confusion matrix analysis. The results show that linear discriminative models are well suited to sparse TF-IDF spaces, with SVM and Logistic Regression providing balanced class-wise behaviour and Naive Bayes offering a computationally efficient baseline. XGBoost delivers moderate improvements over simple probabilistic models, while Random Forest achieves substantial gains through ensemble learning. The best overall performance is obtained by the MLP, demonstrating that non-linear neural modeling more effectively captures complex feature interactions and entity-relevance patterns. Misclassifications are concentrated at the Neutral/Irrelevant boundary, reflecting relevance ambiguity at the entity level and motivating future extensions with context-aware deep architectures and entity-conditioned representations. These baselines support monitoring applications for brands and public figures, and they expose the limits of non-contextual features for sarcasm and implicit targets in Twitter discourse.
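The macro-averaged metrics mentioned in the abstract weight every sentiment class equally, regardless of how many tweets it contains. The following is a minimal, standard-library sketch of that computation, not the authors' evaluation code. The function name macro_scores and the toy label lists are hypothetical, and the four class names are an assumption (the abstract itself names only Neutral and Irrelevant).

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    # Macro average: unweighted mean of the per-class scores.
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels (made up for illustration) with two Neutral/Irrelevant confusions.
y_true = ["Positive", "Negative", "Neutral", "Irrelevant", "Neutral", "Irrelevant"]
y_pred = ["Positive", "Negative", "Irrelevant", "Neutral", "Neutral", "Irrelevant"]
precision, recall, f1 = macro_scores(y_true, y_pred)
print(precision, recall, f1)
```

On this toy input, Positive and Negative score 1.0 while Neutral and Irrelevant each score 0.5, so all three macro averages come out to 0.75. Because the mean is unweighted, confusion between two classes lowers the macro score no matter how small those classes are.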
