Word usage mirrors community structure in the online social network Twitter
1 School of Biological Sciences, Royal Holloway, University of London, Egham, TW20 0EX, UK
2 Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, 08544, USA
3 London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
EPJ Data Science 2013, 2:3 doi:10.1140/epjds15
Published: 25 February 2013
Language has functions that transcend the transmission of information, and it varies with social context. To find out how language and social network structure interlink, we studied communication on Twitter, a widely used online messaging service.
We show that the network emerging from user communication can be structured into a hierarchy of communities, and that the frequencies of words used within those communities closely replicate this pattern. Consequently, communities can be characterised by their most significantly used words. The words used by an individual user, in turn, can be used to predict the community of which that user is a member.
This indicates a relationship between human language and social networks, and suggests that the study of online communication offers vast potential for understanding the fabric of human society. Our approach can be used for enriching community detection with word analysis, which provides the ability to automate the classification of communities in social networks and identify emerging social groups.
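To make the idea of predicting a user's community from word usage concrete, the following is a minimal sketch. It is not the method used in the study, which the abstract does not specify. It swaps in a simple naive Bayes-style score with add-one smoothing, and the community names, word lists, and function names (`train`, `predict`) are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def train(community_words):
    """Count word occurrences per community.

    community_words maps a (hypothetical) community label to the list of
    words observed in messages from that community.
    """
    counts = {label: Counter(words) for label, words in community_words.items()}
    vocab = set()
    for counter in counts.values():
        vocab.update(counter)
    return counts, vocab

def predict(user_words, counts, vocab):
    """Return the community whose word frequencies best match user_words.

    Assumption: a naive Bayes-style log-likelihood with add-one smoothing,
    ignoring words that never appeared in the training data.
    """
    best_label, best_score = None, -math.inf
    for label, counter in counts.items():
        total = sum(counter.values())
        score = 0.0
        for word in user_words:
            if word in vocab:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counter[word] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data, invented for illustration only.
toy = {
    "football": ["goal", "match", "team", "goal", "league"],
    "music": ["album", "song", "tour", "song", "band"],
}
counts, vocab = train(toy)
football_guess = predict(["goal", "match"], counts, vocab)
music_guess = predict(["song", "band"], counts, vocab)
```

In this toy setting, a user who writes mostly about goals and matches scores higher under the "football" word counts than under the "music" counts, which is the intuition behind using a user's vocabulary to infer community membership.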