Work place: Lokmanya Tilak College of Engineering, Navi Mumbai, 400 079, India
E-mail: rakhi.akhare@ltce.in
Website: https://orcid.org/0000-0002-4172-6671
Research Interests: Big Data Analytics, Data Mining, Machine Learning
Biography
Dr. Rakhi D. Akhare received her Ph.D. degree in Computer Engineering from the University of Mumbai, India, in 2025. She is currently working as an Assistant Professor at Lokmanya Tilak College of Engineering, Navi Mumbai, India. Her current research interests include big data analytics, data mining, and machine learning. She has published works on deep learning and NLP-based video summarization.
By Kavita R. Shelke Malcolm Alex Raj Darshana S. Gajbhiye Rakhi D. Akhare
DOI: https://doi.org/10.5815/ijem.2026.03.22, Pub. Date: 8 Jun. 2026
Social media streams reflect public opinion in real time, but short and noisy tweets make it hard to attribute sentiment to entities. This paper introduces an AI/ML pipeline that classifies the sentiment expressed towards the entities referenced in Twitter messages. Using the Twitter Entity Sentiment Analysis benchmark (twitter_training.csv and twitter_validation.csv), the tweets are normalized (lowercasing, removal of punctuation and platform-specific artifacts, tokenization, stop-word filtering, and lemmatization) and represented using TF-IDF (Term Frequency–Inverse Document Frequency) features with a maximum of 5000 terms. Machine learning models including Logistic Regression, linear SVM, and Multinomial Naive Bayes, together with ensemble and neural methods such as Random Forest, XGBoost, and Multilayer Perceptron (MLP), are trained on the training split and evaluated on the validation split using macro-averaged precision, recall, and F1-score, along with confusion matrix analysis. The results show that linear discriminative models are well suited to sparse TF-IDF spaces, with SVM and Logistic Regression providing balanced class-wise behaviour and Naive Bayes offering a computationally efficient baseline. XGBoost delivers moderate improvements over simple probabilistic models, while Random Forest achieves substantial gains through ensemble learning. The best overall performance is obtained by MLP, demonstrating that non-linear neural modeling more effectively captures complex feature interactions and entity-relevance patterns. Misclassifications concentrate at the Neutral-Irrelevant boundary, reflecting ambiguity about entity-level relevance, and motivate future extensions with context-aware deep architectures and entity-conditioned representations. These baselines support sentiment monitoring for brands and public figures and expose the limits of non-contextual features for sarcasm and implicit targets in Twitter discourse.
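To make the evaluation protocol concrete, the following is a minimal sketch of how macro-averaged precision, recall, and F1-score can be derived from a confusion matrix. It is not the authors' implementation. The label set (Positive, Negative, Neutral, Irrelevant), the function names, and the toy predictions are assumptions for illustration only; the abstract itself names only the Neutral and Irrelevant classes, and the numbers printed here are not results from the paper.

```python
# Illustrative sketch only: label set, helper names, and toy data are assumptions.
LABELS = ["Positive", "Negative", "Neutral", "Irrelevant"]  # assumed class labels


def confusion_matrix(y_true, y_pred, labels):
    """Return nested counts where counts[true_label][predicted_label] is a tally."""
    counts = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return counts


def macro_scores(y_true, y_pred, labels):
    """Compute per-class precision/recall/F1, then take the unweighted mean."""
    cm = confusion_matrix(y_true, y_pred, labels)
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = cm[c][c]
        predicted = sum(cm[t][c] for t in labels)  # column sum: predicted as c
        actual = sum(cm[c].values())               # row sum: truly c
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels (hypothetical): note the confusion between Neutral and Irrelevant.
y_true = ["Positive", "Positive", "Negative", "Negative",
          "Neutral", "Neutral", "Irrelevant", "Irrelevant"]
y_pred = ["Positive", "Positive", "Negative", "Neutral",
          "Neutral", "Irrelevant", "Irrelevant", "Neutral"]

cm = confusion_matrix(y_true, y_pred, LABELS)
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred, LABELS)
print("Neutral predicted as Irrelevant:", cm["Neutral"]["Irrelevant"])
print(f"macro P={macro_p:.3f} R={macro_r:.3f} F1={macro_f1:.3f}")
```

Macro averaging weights every class equally regardless of how many tweets it contains, so a weak minority class lowers the score as much as a weak majority class. The off-diagonal cells of the confusion matrix show which pairs of classes are being confused.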