Victoria Vysotska; Alina Starchenko; Lyubomyr Chyrun; Zhengbing Hu; Yuriy Ushenko; Dmytro Uhryn

Sentiment Analysing and Visualising Public Opinion on Political Figures across YouTube and Twitter Using NLP and Machine Learning

PDF (2187KB), PP.117-164


Author(s)

Victoria Vysotska ¹, Alina Starchenko ¹, Lyubomyr Chyrun ², Zhengbing Hu ³, Yuriy Ushenko ⁴,*, Dmytro Uhryn ⁴

1. Department of Information Systems and Networks, Lviv Polytechnic National University, Lviv, 79013, Ukraine

2. Ivan Franko National University of Lviv, Lviv, 79000, Ukraine

3. School of Computer Science, Hubei University of Technology, Wuhan, China

4. Department of Computer Science, Yuriy Fedkovych Chernivtsi National University, Chernivtsi, 58012, Ukraine

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2025.05.08

Received: 12 Mar. 2025 / Revised: 22 May 2025 / Accepted: 9 Jul. 2025 / Published: 8 Oct. 2025

Index Terms

Sentiment Analysis, Public Opinion, Social Networks, Twitter, YouTube, Ukrainian-Language Content, Natural Language Processing, NLP, Machine Learning

Abstract

This study analyses public sentiment towards Ukrainian political figures based on social media comments on YouTube and Twitter. It aims to identify differences in how political leaders are perceived and to understand how the platform affects the tone of statements. The main research question is how public opinion about politicians in Ukraine differs between YouTube and Twitter during the full-scale war. The article presents a comprehensive analysis of public opinion on five Ukrainian public figures (S. Prytula, P. Poroshenko, V. Zelensky, S. Sternenko, A. Yermak). A corpus of Ukrainian-language comments and tweets from 2022 to 2023 was collected using the YouTube Data API and the Apify platform, and was then cleaned, normalised and lemmatised, taking into account slang, surzhyk and spelling mistakes. The sentiment analysis model, built on multilingual-e5-base embeddings and the XGBClassifier algorithm, achieved an accuracy of 89.4%, a macro-F1 of 88.7%, and a weighted F1 of 89.1%. Sentiment distribution analysis revealed that, on average, 42% of messages were positive, 36% were negative, and 22% were neutral. Twitter had a higher share of negative statements (up to 40%), while positive sentiment predominated on YouTube (up to 47%). These results indicate that public figures are perceived differently on different platforms and confirm the effectiveness of the developed approach for the Ukrainian-speaking segment of social networks. The sentiment distributions also differ significantly in character: comments on YouTube are more likely to be marked by emotional intensity and harshness.
Twitter, by contrast, exhibits a more concise but no less polarised discourse. Possible reasons for this difference include the format of the platforms, their audiences, and the speed of content distribution. Further research should take into account user demographic biases, as well as the activity of bots and coordinated campaigns that can distort the perception of public opinion. The practical significance of the study is that politicians, journalists, and public figures can use its results to better understand the mood of society, predict reactions to political events, and build more effective communication. The study also has limitations: automated sentiment analysis has difficulty detecting sarcasm, irony, and context-sensitive meanings, which can affect the accuracy of the results. In addition, the study takes into account the ethical aspects of data collection and analysis: only publicly available comments were used, without interference in the private sphere of users. Such technologies carry a risk of abuse, and the study emphasises the need for responsible application of the findings.
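
The abstract reports accuracy, macro-F1 and weighted F1 as separate figures. As a minimal illustration of how these three metrics differ, the following Python sketch computes them from scratch on a small invented label set. The function name `f1_report` and the toy labels are assumptions made for illustration only; they are not taken from the paper, and the authors' own evaluation code may differ.

```python
from collections import Counter


def f1_report(y_true, y_pred):
    """Return accuracy, macro-F1 and weighted F1 for two label sequences.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    pairs = list(zip(y_true, y_pred))

    per_class_f1 = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        per_class_f1[label] = 2 * tp / denom if denom else 0.0

    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Macro-F1: unweighted mean over classes, so rare classes count equally.
    macro_f1 = sum(per_class_f1.values()) / len(labels)
    # Weighted F1: mean over classes weighted by the number of true examples.
    weighted_f1 = sum(per_class_f1[l] * support[l] for l in labels) / len(pairs)

    return {
        "accuracy": accuracy,
        "macro_f1": macro_f1,
        "weighted_f1": weighted_f1,
        "per_class_f1": per_class_f1,
    }


# Invented toy labels (not data from the study).
y_true = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neu", "neu", "neu"]
y_pred = ["pos", "pos", "pos", "neg", "neg", "neg", "neu", "neu", "neu", "pos"]

report = f1_report(y_true, y_pred)
print(report)
```

On imbalanced data the macro figure falls below the weighted one whenever the smaller classes are classified less well, which is consistent with the reported macro-F1 (88.7%) being slightly lower than the weighted F1 (89.1%).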

Cite This Paper

Victoria Vysotska, Alina Starchenko, Lyubomyr Chyrun, Zhengbing Hu, Yuriy Ushenko, Dmytro Uhryn, "Sentiment Analysing and Visualising Public Opinion on Political Figures across YouTube and Twitter Using NLP and Machine Learning", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 17, No. 5, pp. 117-164, 2025. DOI: 10.5815/ijigsp.2025.05.08

International Journal of Image, Graphics and Signal Processing (IJIGSP)

MECS Press Journal