Regular article (open access)

Word usage mirrors community structure in the online social network Twitter

John Bryden1, Sebastian Funk2,3* and Vincent AA Jansen1

Author Affiliations

1 School of Biological Sciences, Royal Holloway, University of London, Egham, TW20 0EX, UK

2 Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, 08544, USA

3 London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK

EPJ Data Science 2013, 2:3 doi:10.1140/epjds15

Published: 25 February 2013

Abstract

Background

Language serves functions beyond the transmission of information, and it varies with social context. To find out how language and social network structure are linked, we studied communication on Twitter, a widely used online messaging service.

Results

We show that the network emerging from user communication can be structured into a hierarchy of communities, and that the frequencies of words used within those communities closely replicate this pattern. Consequently, communities can be characterised by their most significantly used words. The words used by an individual user, in turn, can be used to predict the community of which that user is a member.
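
As a rough illustration of the last point, the sketch below guesses a community from the words a user writes. It is not the method used in this article: it swaps in a simple smoothed word-frequency (naive Bayes style) classifier, and the toy data, the community labels and the function names (`word_counts`, `train`, `log_likelihood`, `predict_community`) are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: community label -> messages written by its members.
COMMUNITY_MESSAGES = {
    "cycling": ["great ride today", "new bike gear ride", "bike race this weekend"],
    "baking": ["fresh bread recipe", "bake bread today", "cake recipe ideas"],
}


def word_counts(messages):
    """Count how often each word occurs across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(message.lower().split())
    return counts


def train(community_messages):
    """Return per-community word counts and the vocabulary shared by all."""
    counts = {c: word_counts(msgs) for c, msgs in community_messages.items()}
    vocabulary = set().union(*counts.values())
    return counts, vocabulary


def log_likelihood(words, counts, vocabulary):
    """Log-probability of the words under one community (add-one smoothing)."""
    total = sum(counts.values())
    # The extra +1 reserves probability mass for words outside the vocabulary.
    denominator = total + len(vocabulary) + 1
    return sum(math.log((counts[w] + 1) / denominator) for w in words)


def predict_community(text, all_counts, vocabulary):
    """Pick the community whose word frequencies best explain the text."""
    words = text.lower().split()
    return max(
        all_counts,
        key=lambda c: log_likelihood(words, all_counts[c], vocabulary),
    )


counts, vocabulary = train(COMMUNITY_MESSAGES)
print(predict_community("bike ride", counts, vocabulary))  # cycling
```

In this sketch the community labels are given in advance; in the setting described above they would instead come from a community detection step applied to the communication network.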

Conclusions

These results indicate a relationship between human language and social networks, and suggest that the study of online communication offers vast potential for understanding the fabric of human society. Our approach can be used to enrich community detection with word analysis, making it possible to automate the classification of communities in social networks and to identify emerging social groups.