Darshana S. Gajbhiye

Work place: Xavier Institute of Engineering, Mumbai, 400016, India

E-mail: arshanatambe.phdwork@gmail.com

Website: https://orcid.org/0009-0000-0056-2597

Research Interests:

Biography

Dr. Darshana S. Gajbhiye received her Ph.D. degree from Mumbai University, Mumbai, in 2025. She is currently working as an Assistant Professor in the Department of Information Technology at Xavier Institute of Engineering, Mumbai. Her teaching and research areas span Data Mining, Business Intelligence, Software Engineering, Cyber Security, and AI-driven Bug Prediction and Localization. She has published extensively in reputed journals and international conferences.

Author Articles
Benchmarking Linear, Ensemble, and Neural Models for Entity-Level Sentiment Analysis on Twitter Data

By Kavita R. Shelke, Malcolm Alex Raj, Darshana S. Gajbhiye, Rakhi D. Akhare

DOI: https://doi.org/10.5815/ijem.2026.03.22, Pub. Date: 8 Jun. 2026

Social media streams reflect public opinion in real time, but short, noisy tweets make it hard to attribute sentiment to specific entities. This paper introduces an AI/ML pipeline that classifies the sentiment expressed towards the entities referenced in Twitter messages. Using the Twitter Entity Sentiment Analysis benchmark (twitter_training.csv and twitter_validation.csv), the tweets are normalized (lowercasing, handling of punctuation and platform-specific artifacts, tokenization, stop-word filtering, and lemmatization) and represented using TF-IDF (Term Frequency–Inverse Document Frequency) features with a maximum of 5000 terms. Machine learning models (Logistic Regression, linear SVM, and Multinomial Naive Bayes) together with ensemble and neural methods (Random Forest, XGBoost, and a Multilayer Perceptron (MLP)) are trained on the training split and evaluated on the validation split using macro-averaged precision, recall, and F1-score, along with confusion matrix analysis. The results show that linear discriminative models are well suited to sparse TF-IDF spaces, with SVM and Logistic Regression providing balanced class-wise behaviour and Naive Bayes offering a computationally efficient baseline. XGBoost delivers moderate improvements over simple probabilistic models, while Random Forest achieves substantial gains through ensemble learning. The best overall performance is obtained by the MLP, demonstrating that non-linear neural modeling more effectively captures complex feature interactions and entity-relevance patterns. Misclassifications are concentrated at the Neutral-Irrelevant boundary, reflecting ambiguity about entity-level relevance, and they motivate future extensions with context-aware deep architectures and entity-conditioned representations. These baselines support monitoring of brands and public figures and expose the limits of non-contextual features for sarcasm and implicit targets in Twitter discourse.
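The macro-averaged precision, recall, and F1-score named in the abstract can be illustrated with a minimal sketch. The function name `macro_scores` and the toy labels below are hypothetical and are not taken from the paper; the sketch assumes that macro-F1 is the unweighted mean of per-class F1 values, which is one common convention.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1).

    Each class contributes equally, regardless of how many
    examples it has. Assumption: macro-F1 is the mean of the
    per-class F1 scores.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels (illustrative only): two Neutral/Irrelevant confusions.
y_true = ["Positive", "Negative", "Neutral", "Irrelevant", "Neutral", "Irrelevant"]
y_pred = ["Positive", "Negative", "Irrelevant", "Irrelevant", "Neutral", "Neutral"]

macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
print(f"macro precision={macro_p:.2f} recall={macro_r:.2f} F1={macro_f1:.2f}")
```

Because every class is weighted equally, errors on the Neutral and Irrelevant classes pull the macro scores down just as much as errors on any other class would, whatever the class frequencies are.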

Other Articles