Exploring Automated Thematic Analysis of YouTube Comments Relevant to Social Support in Postpartum Depression Using Machine Learning Techniques

Abstract

The primary purpose of this study is to explore the feasibility of using Natural Language Processing (NLP) to automate the thematic analysis of YouTube comments relevant to social support in postpartum depression. We systematically collected and manually analyzed 7,243 YouTube comments on postpartum depression from January 2022 to June 2023. We compared human and NLP-generated themes using five supervised machine learning algorithms, evaluating each on accuracy, precision, recall, F1-score, and the ability to balance minority and majority class representation. The dataset was split 80/20 into training and testing sets, and the models were optimized through hyperparameter tuning and 10-fold cross-validation. Each algorithm was tested in three scenarios: with class weighting, with SMOTE (Synthetic Minority Over-sampling Technique), and without either. Logistic Regression with class weight balancing demonstrated the best overall performance, achieving an accuracy of 0.69 and a fair balance in handling minority classes. Although the Support Vector Machine with class weight balancing also achieved an accuracy of 0.69, it was comparatively less effective in representing minority classes, making it the second-best choice. Social media can provide valuable insights into social support; however, the labour-intensive nature of manually analyzing it creates barriers to using these data. The findings indicate the potential of NLP for automating the thematic analysis of social support in other health-related conditions and provide a comparison of machine learning algorithms that can best support such analysis.
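The abstract reports two models with the same accuracy (0.69) that differ in how well they represent minority classes. The sketch below illustrates, under stated assumptions, why accuracy alone can hide that difference. It is not the authors' code: the theme names ("emotional", "informational") and the toy labels are hypothetical, and the weighting formula is the common "balanced" heuristic, n_samples / (n_classes * class_count), which may or may not match the class weighting used in the study.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Assumption: this is the widely used "balanced" heuristic; the study's
    exact weighting scheme is not stated in the abstract.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def per_class_metrics(y_true, y_pred):
    """Return {class: (precision, recall, f1)} for every class in y_true."""
    metrics = {}
    for c in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the human-assigned label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so minority classes count equally."""
    m = per_class_metrics(y_true, y_pred)
    return sum(f1 for _, _, f1 in m.values()) / len(m)


# Hypothetical toy data: 8 majority-class comments, 2 minority-class comments.
y_true = ["emotional"] * 8 + ["informational"] * 2
# A degenerate classifier that always predicts the majority theme.
y_pred = ["emotional"] * 10

weights = balanced_class_weights(y_true)
acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
```

In this toy case the accuracy looks respectable while the macro-averaged F1 is much lower, because the minority theme is never predicted. The balanced weights give the minority class a larger weight than the majority class, which is the general idea behind class weight balancing.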

Presenters

Anila Virani
Assistant Professor, Nursing, Thompson Rivers University, British Columbia, Canada

Piper Jackson
Assistant Professor, Computing Science, Thompson Rivers University, British Columbia, Canada

Ahsan Mollani
Student, Thompson Rivers University, British Columbia, Canada

Details

Presentation Type

Paper Presentation in a Themed Session

Theme

2024 Special Focus—People, Education, and Technology for a Sustainable Future

Keywords

Machine Learning, Natural Language Processing, Thematic Analysis