From: Predicting future mental illness from social media: A big-data approach
| Subreddits | ADHD | Anxiety | Bipolar | Depression | Total |
| --- | --- | --- | --- | --- | --- |
| Mental illness subreddits (Study 1) | .83 | .75 | .75 | .74 | .77 |
| All other subreddits (Study 2) | .42 | .30 | .34 | .44 | .38 |
| All other subreddits, future prediction (Study 3) | .39 | .32 | .37 | .36 | .36 |
Performance is reported as F score, separately for each disorder and averaged over all four disorders (the Total column). Chance performance is .25. Performance is scored on held-out test data.
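
The scoring described in the note can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code. It assumes that "F score" means the balanced F1 score and that the Total column is the unweighted mean of the four per-disorder scores, an assumption that agrees with the reported values to within rounding. The function names `per_class_f1` and `macro_average` are hypothetical.

```python
def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1} computed one-vs-rest for each label."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_average(scores):
    """Unweighted mean of per-class scores (assumed to be the Total column)."""
    return sum(scores.values()) / len(scores)


# Per-disorder F scores copied from the table above.
reported = {
    "Study 1": {"ADHD": 0.83, "Anxiety": 0.75, "Bipolar": 0.75, "Depression": 0.74},
    "Study 2": {"ADHD": 0.42, "Anxiety": 0.30, "Bipolar": 0.34, "Depression": 0.44},
    "Study 3": {"ADHD": 0.39, "Anxiety": 0.32, "Bipolar": 0.37, "Depression": 0.36},
}

if __name__ == "__main__":
    for study, scores in reported.items():
        print(study, "mean per-disorder F:", macro_average(scores))
```

Calling `per_class_f1` on held-out labels and predictions and passing the result to `macro_average` would give a single summary number comparable to the Total column, under the assumptions stated above.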