Kavita R. Shelke

Work place: Agnel Charities, Fr. C. Rodrigues Institute of Technology, Navi Mumbai, 400703, India

E-mail: kavita.shelke@fcrit.ac.in

Website: https://orcid.org/0000-0002-6710-2736

Research Interests:

Biography

Kavita R. Shelke is a research scholar and Assistant Professor in Computer Engineering at Fr. C. Rodrigues Institute of Technology, Vashi, Navi Mumbai. She completed her M.E. in Computer Engineering in 2013 and has more than 15 years of experience in academia. She has published about 18 papers in international journals and conferences. Her research areas include Blockchain, Artificial Intelligence, and Computational Algorithms.

Author Articles
Benchmarking Linear, Ensemble, and Neural Models for Entity-Level Sentiment Analysis on Twitter Data

By Kavita R. Shelke, Malcolm Alex Raj, Darshana S. Gajbhiye, Rakhi D. Akhare

DOI: https://doi.org/10.5815/ijem.2026.03.22, Pub. Date: 8 Jun. 2026

Social media streams reflect public opinion in real time, but short, noisy tweets make it hard to attribute sentiment to specific entities. This paper introduces an AI/ML pipeline that classifies the sentiment expressed towards the entities referenced in Twitter messages. Using the Twitter Entity Sentiment Analysis benchmark (twitter_training.csv and twitter_validation.csv), the tweets are normalized (lowercasing, removal of punctuation and platform-specific artifacts, tokenization, stop-word filtering, and lemmatization) and represented using TF-IDF (Term Frequency–Inverse Document Frequency) features with a maximum of 5000 terms. Machine learning models including Logistic Regression, linear SVM, and Multinomial Naive Bayes, together with ensemble and neural methods such as Random Forest, XGBoost, and Multilayer Perceptron (MLP), are trained on the training split and evaluated on the validation split using macro-averaged precision, recall, and F1-score, along with confusion matrix analysis. The results show that linear discriminative models are well suited to sparse TF-IDF spaces, with SVM and Logistic Regression providing balanced class-wise behaviour and Naive Bayes offering a computationally efficient baseline. XGBoost delivers moderate improvements over simple probabilistic models, while Random Forest achieves substantial gains through ensemble learning. The best overall performance is obtained by the MLP, indicating that non-linear neural modeling captures complex feature interactions and entity-relevance patterns more effectively. Misclassifications concentrate on the Neutral-Irrelevant boundary, reflecting ambiguity about entity-level relevance and motivating future extensions with context-aware deep architectures and entity-conditioned representations. These baselines support sentiment monitoring for brands and public figures, and they expose the limits of non-contextual features for sarcasm and implicit targets in Twitter discourse.
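
The pipeline stages named in the abstract (normalization, TF-IDF with a capped vocabulary, macro-averaged scoring) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code: the stop-word list, the function names `normalize`, `tfidf`, and `macro_f1`, and the smoothed IDF variant are all assumptions made for illustration, and lemmatization and the classifiers themselves are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say which list is used.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "in", "this", "it"}


def normalize(tweet):
    """Lowercase, drop URLs and punctuation, tokenize, and filter stop words."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)   # platform-specific artifact: URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # punctuation, including '@' and '#'
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf(docs, max_features=5000):
    """Weight tokenized docs by TF-IDF, keeping only the most frequent terms.

    Assumes a smoothed IDF, log((1 + n) / (1 + df)) + 1; the abstract does not
    state which IDF variant the paper uses.
    """
    term_counts = Counter(tok for doc in docs for tok in doc)
    vocab = [term for term, _ in term_counts.most_common(max_features)]
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Sparse vector: only terms that survived the vocabulary cap.
        vectors.append({t: tf[t] * idf[t] for t in tf if t in idf})
    return vocab, vectors


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro averaging is relevant to the error pattern the abstract reports: because each class contributes equally to the score, confusion between Neutral and Irrelevant lowers the metric even when the other classes are predicted well.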

Other Articles