Master thesis defense by Emma Høst Fowler


Exploring suicide related language on Twitter - An investigation of suicide related language prior to and following a self-stated suicide attempt


The present study explored the contents of posts written by a set of Twitter users (n=47) and the extent to which their writings exhibited social engagement and self-focus, prior to and following a self-stated suicide attempt. The study examined 1) the content of tweets posted by users, and 2) users’ level of social engagement as measured by both the quantity and context of their pronoun use. These two aspects of the collected data were examined using 1) the tf-idf weighting scheme, 2) Latent Dirichlet Allocation (LDA) and 3) Doc2Vec.

A tf-idf analysis of n-grams occurring in tweets posted prior to and following a stated suicide attempt indicated that highly scored n-grams did not differ much in meaning before and after the attempt. Users posted about mental health issues during both time periods and used many n-grams referring to cognitive processes. N-grams concerned with the desire to die were found in the time leading up to the attempt, while n-grams associated with the attempted suicide were found in tweets posted in the time following it.
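
As an illustration of the tf-idf weighting mentioned above, the following is a minimal sketch in plain Python. It assumes the common formulation tf × log(N / df), whitespace tokenisation, and two neutral toy "documents" standing in for the before and after tweet collections; the function names, the toy texts and the exact weighting variant are illustrative assumptions and are not taken from the thesis.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Score every n-gram in every document with tf * log(N / df).

    This is one common tf-idf variant (an assumption here); libraries
    often add smoothing or normalisation on top of it.
    """
    grams_per_doc = [Counter(ngrams(doc.lower().split(), n)) for doc in docs]
    num_docs = len(docs)

    # Document frequency: in how many documents does each n-gram occur?
    df = Counter()
    for counts in grams_per_doc:
        df.update(counts.keys())

    scores = []
    for counts in grams_per_doc:
        total = sum(counts.values())
        scores.append({
            gram: (count / total) * math.log(num_docs / df[gram])
            for gram, count in counts.items()
        })
    return scores


# Hypothetical toy texts, one per time period (not data from the study).
before = "i want to sleep i want to stop"
after = "i went to hospital i want to rest"
scores_before, scores_after = tfidf([before, after], n=2)

# Bigrams shared by both periods score zero; period-specific ones score higher.
top_before = sorted(scores_before, key=scores_before.get, reverse=True)[:3]
```

Under this weighting, n-grams that occur in both periods receive a score of zero, so the highest-scored n-grams are those that distinguish one period from the other.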

To further examine the contents of the collected data, LDA was used to investigate shifts in the topics users engaged in before and after a suicide attempt. The topic shift analysis showed an increase, in tweets posted following the attempt, in topics related to hope, life, change, negative feelings, homelessness, other people, and words associated with the suicide attempt such as ’hospital’. A decrease in a topic mainly concerned with school and work was found in tweets after the stated attempt.

Lastly, users’ social engagement and self-focus were investigated by measuring the use of first person singular, first person plural, and second and third person pronouns (signalling self-focus, collective attention and social interest, respectively). Differences in the use of these pronoun categories were tested for using paired t-tests. A significant decrease was found in first person singular pronouns from tweets posted prior to the stated attempt to tweets posted following it, perhaps suggesting a heightened self-focus in the time leading up to an attempted suicide. No significant differences were found in first person plural, second person or third person pronoun use over time, indicating no differences in social engagement or interest in the time preceding or following the stated suicide attempt.
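
The pronoun comparison described above can be sketched roughly as follows. This is a minimal illustration, assuming that each user contributes one pronoun rate per period (pronoun tokens divided by all tokens) and that the paired t statistic is computed on the per-user differences. The pronoun list, the tokenisation and the numbers are hypothetical and are not the study's data or its exact procedure.

```python
import math
import re
from statistics import mean, stdev

# Assumed pronoun list for illustration; contractions such as "i'm" are not split.
FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}


def pronoun_rate(text, pronouns=FIRST_PERSON_SINGULAR):
    """Share of tokens in `text` that belong to the given pronoun set."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in pronouns for token in tokens) / len(tokens)


def paired_t(before, after):
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample sd / sqrt(n)).
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical per-user rates, same user order in both lists.
rates_before = [0.10, 0.12, 0.08, 0.11]
rates_after = [0.07, 0.10, 0.06, 0.10]
t_stat, dof = paired_t(rates_before, rates_after)
```

A positive t statistic here means the rate was higher before than after. Turning it into a p-value requires the t distribution with `dof` degrees of freedom, which a statistics library would normally supply.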

To investigate the context of pronouns, the spatial similarity of first person singular, first person plural, second person and third person pronouns to other words was computed using document and word embeddings created in Doc2Vec. Words most similar to the different pronouns in tweets posted preceding the suicide attempt were related to anger, negative feelings, cognitive processes, everyday items, longing and the desire to die. The most similar words found following the stated attempt were comparable in meaning, possibly suggesting that feelings of anger and the desire to kill oneself persist in the time following an attempted suicide.
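
To make the idea of "most similar words" concrete, the sketch below ranks words by cosine similarity to a pronoun's vector. Cosine similarity is assumed here as the similarity measure, since it is the usual choice for word embeddings; the three-dimensional vectors are hypothetical stand-ins for embeddings that would in practice come from a trained Doc2Vec model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=3):
    """Return the `topn` words whose vectors are closest to `word`."""
    target = vectors[word]
    ranked = sorted(
        ((other, cosine(target, vec))
         for other, vec in vectors.items() if other != word),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:topn]


# Hypothetical toy embeddings (not learned from any data).
toy_vectors = {
    "i": [0.9, 0.1, 0.0],
    "tired": [0.8, 0.2, 0.1],
    "we": [0.1, 0.9, 0.0],
    "together": [0.2, 0.8, 0.1],
}
neighbours = most_similar("i", toy_vectors, topn=2)
```

Running the same ranking on embeddings trained separately on the before and after tweets would give two neighbour lists per pronoun, which can then be compared by meaning.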

Supervisor: Dirk Hovy
External Examiner: Dan Witzner Hansen, ITU