Early Detection of Stress and Anxiety Using NLP and Machine Learning on Social Media Data

PDF (2628KB), PP.70-94

Author(s)

Ravi Arora (1,2), S. V. A. V. Prasad (1), Arvind Rehalia (3), Nikhil Kaushik (4), Anil Kumar (5,*)

1. Lingaya Vidyapeeth, Faridabad, Haryana 121002, India

2. Department of Information Technology, Bharati Vidyapeeth's College of Engineering, New Delhi, Delhi, 110063, India

3. Department of Information Technology, Bharati Vidyapeeth's College of Engineering, New Delhi, Delhi, 110063, India

4. Department of Electronics and Communication Engineering, Lingaya Vidyapeeth, Faridabad, Haryana 121002, India

5. Department of Computer Science and Engineering, Bharati Vidyapeeth's College of Engineering, New Delhi 110063, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijitcs.2025.06.04

Received: 21 Jul. 2025 / Revised: 24 Sep. 2025 / Accepted: 2 Nov. 2025 / Published: 8 Dec. 2025

Index Terms

Stress Detection, Anxiety Prediction, Natural Language Processing, Social Media, Machine Learning, Long Short-term Memory, Real-time Monitoring

Abstract

Stress and anxiety are among the most common mental health conditions in contemporary society, and detecting them early is important for promoting individual well-being. This work investigates whether stress and anxiety can be identified from social media (SM) data and an anonymous survey using machine learning (ML) and natural language processing (NLP). Data collection combines the DASS-21 questionnaire with a sample of tweets from Twitter users in India, with the aim of determining which language is associated with stress and anxiety. The collected data is pre-processed through URL removal, lowercasing, punctuation removal, stop-word removal, and lemmatization. The cleaned text is then converted into numerical vectors with Word2Vec to facilitate pattern analysis. To examine the main topics in the dataset, Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF) are applied. For classification, the work uses Support Vector Machine (SVM), Random Forest (RF), and Long Short-Term Memory (LSTM) models. Finally, a Streamlit application lets users interact with the model.
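
The pre-processing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list and the lemma lookup table below are small hypothetical stand-ins for the full resources a real pipeline would load from an NLP library, and the function name `preprocess` is chosen only for illustration.

```python
import re
import string
from typing import List

# Hypothetical, deliberately tiny stop-word list (assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "i", "am", "is", "are", "so", "and",
              "about", "my", "to", "of"}

# Hypothetical lemma lookup standing in for a real lemmatizer (assumption).
LEMMAS = {"feeling": "feel", "exams": "exam", "worried": "worry",
          "nights": "night"}

# Matches http(s) links and bare www. links up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> List[str]:
    """Apply the five steps in the order the abstract lists them."""
    text = URL_PATTERN.sub(" ", text)      # 1. URL removal
    text = text.lower()                    # 2. lowercasing
    text = text.translate(PUNCT_TABLE)     # 3. punctuation removal
    tokens = [t for t in text.split() if t not in STOP_WORDS]  # 4. stop words
    return [LEMMAS.get(t, t) for t in tokens]                  # 5. lemmatization


if __name__ == "__main__":
    print(preprocess("I am SO worried about my exams!! https://t.co/abc123"))
```

URLs are removed before punctuation so that a link is dropped as a whole rather than being broken into leftover word fragments. The resulting token lists are the kind of input that an embedding step such as Word2Vec would consume.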

Cite This Paper

Ravi Arora, S. V. A. V. Prasad, Arvind Rehalia, Nikhil Kaushik, Anil Kumar, "Early Detection of Stress and Anxiety Using NLP and Machine Learning on Social Media Data", International Journal of Information Technology and Computer Science (IJITCS), Vol. 17, No. 6, pp. 70-94, 2025. DOI: 10.5815/ijitcs.2025.06.04
