Using Data Analysis and Machine Learning for Studying and Predicting Depression in Users on Social Media



Singh, Chanpreet




Mental health problems leading to depression have become a critical concern due to the heavy engagement of people on social media platforms. Several past approaches have analyzed the patterns, behaviour, and vocabulary of users' posts on social networking sites. This research proposes a system to predict which users may be affected by depression by examining the characteristics of users who are already affected. Tweet-level and user-level architectures were combined to produce a more robust and reliable system: at the tweet level, semantic embeddings trained with advanced neural networks were adopted, whilst at the user level, 12 significant features were derived through extensive feature engineering. At the tweet level, an SVM with Word2Vec and TF-IDF yielded an accuracy of 98.14% and a recall of 95.63%, whereas at the user level the gradient boosting classifier achieved an accuracy of 95.26% with a recall of 86.75%.
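The abstract reports both accuracy and recall for each classifier, and at the user level the two differ noticeably. The sketch below is illustrative only and is not taken from the thesis: it shows how the two metrics are conventionally computed for a binary task, using a hypothetical helper `accuracy_and_recall` and made-up toy labels (1 = depressed, 0 = not depressed). It assumes recall is measured on the depressed class, which the abstract does not state explicitly.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall) for binary labels.

    accuracy = correct predictions / all predictions
    recall   = true positives / (true positives + false negatives)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no labels given")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # If there are no positive examples, recall is undefined; report 0.0 here.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Hypothetical toy labels, not data from the thesis.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2f} recall={rec:.2f}")  # accuracy=0.90 recall=0.75
```

In this toy case a single missed positive lowers recall far more than accuracy, because positives are the minority class. The same effect can leave recall below accuracy on imbalanced data in general.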


Digital media




Carleton University

Thesis Degree Name: Master of Information Technology

Thesis Degree Level:

Thesis Degree Discipline: Digital Media

Parent Collection: Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).