'Notoriously Toxic' presents a preliminary study of the language and impact of hate speech in the chat systems of online games. The project was developed by a group of researchers in game studies, computational linguistics, sociolinguistics, and law, and is guided by a tripartite feedback model broadly corresponding to shielding potential victims from harm, educating those who casually engage in hate speech, and censuring those who persist in abusing their fellow players. The hope is that research-driven technical and social interventions might slowly shift online discourse norms away from casual, vicious, and potentially dangerous speech. Identifying textual expressions of toxic behavior in online environments at scale is a necessary empirical preliminary to understanding the prevalence and cost of online hate, as are qualitative cultural studies of the games and their player populations. A recent example of qualitative framing work in this area is the 'Mapping Study on Projects Against Hate Speech Online' released in 2012 by the British Institute of Human Rights for the Council of Europe project, Young People Combating Hate Speech in Cyberspace (The British Institute of Human Rights Council, 2012). That report provides terminology and an environmental scan of processes aimed at limiting hate speech online and offers suggestions for new procedures. It, along with an examination of the reporting systems implemented across a host of online games, computational modeling of the language prevalent in these chat systems, and a study of work in the political sphere to defuse hate speech before it catalyzes violence, serves as the foundation for this research.
Recent inquiries into the toxic elements of gaming cultures have primarily focused on communication outside of a game environment. For example, critical discourse analyses of player posts to online gaming forums found that the heteronormative undertones of the World of Warcraft player community create a culture of hostility toward LGBTQ communities (Pulos, 2013) and that the same forum's adamant disavowal of feminism has made community conversations about gender roles and/or equality all but non-existent (Braithwaite, 2013). Similarly, Gray's (2012a; 2012b; 2012c) ethnography of Xbox Live demonstrates the constant barrage of gender- and racially motivated harassment faced by women of color who opt to communicate with teammates via voice chat. Finally, community leaders' adamant position that gender-based harassment is a 'non-issue' is summarized by Salter and Blodgett (2012), whose case study examines the dismissal by Penny Arcade (a popular webcomic and the organizer of PAX, a successful annual gaming convention) of its responsibility in perpetuating rape culture. That dismissal, together with SXSW Interactive's recent declaration that conversations about harassment in the games space can by definition not be civil (Sinders, 2015), is indicative of an industry that is highly resistant to change unless external pressure is applied. Taken together, this scholarship is evidence that toxicity exists across gaming culture writ large and is not isolated to a particular game or specific player community.
Studying this phenomenon at web scale and in the ephemeral environments of multilingual online chat systems is complex and requires a multidisciplinary approach bridging core strengths in the humanities, such as cultural criticism, with strengths in social psychology, the data sciences, and linguistics. Studying socially destructive behavior as manifested in online gaming platforms encourages innovative approaches to this problem. One corpus examined as part of this research consists of the chat logs produced by the player base of Riot Games' League of Legends (League). As of January 2014, League had ~27 million unique players every day, each playing no less than 20 minutes, and a peak concurrency of 7.5 million people, who collectively have logged billions of hours of total play time since 2009 (Sherr, 2014). Given that the game is a global phenomenon, the chat logs contain harassment in virtually every language.
Based on the UN framework provided in the International Covenant on Civil and Political Rights (1976), Susan Benesch generally defines hate speech as '. . . an expression that denigrates or stigmatizes a person or people based on their membership of a group that is usually but not always immutable, such as an ethnic or religious group. . . . Speech may express or foment hatred on the basis of any defining feature of a minority or indigenous people, such as ethnicity or religion – and can also denigrate people for another “failing”, such as their gender or even their location, as in the case of migrants' (Benesch, 2014, p. 20). This broad but inclusive definition is further elaborated upon by Nazila Ghanea, who draws on the International Convention on the Elimination of All Forms of Racial Discrimination (ICERD) to establish a spectrum from least to most severe: discriminatory speech, hate speech, incitement to hatred, incitement to terrorism, and incitement to genocide (Ghanea, 2013, pp. 940-1). The characteristics of these definitions reflect the significant impact that hate speech acts have on the establishment and enforcement of personal and communal identity and the need to identify such acts in order to preserve those identities. In his discussion of Carey's ritual model of communication as applied to cases of hate speech, Clay Calvert explains how hate speech initiates and perpetuates the subordination of one group to another (Calvert, 1997). Focusing specifically on the repeated use of racial epithets, Calvert notes that hate speech acts construct reality for the speaker, audience, and target members of the discourse through the creation and maintenance of mental schemata, much as other speech acts do: 'In particular, racist speech helps to define who minorities are and how others think about minorities, facilitating their unequal treatment' (Calvert, 1997, p. 12).
This construction harms the target immediately and, over the long term, perpetuates unequal power structures based on criteria of identity (Calvert, 1997, pp. 15-16). The construction of reality through hate speech acts is also relevant to a discussion of online environments, since textual communication is a large part of social life both on- and offline.
Toxicity consists of verbal expressions and behaviors that serve to destabilize groups. It is unclear whether toxic behavior is intended to elicit particular responses, and hence systemic, or is reactionary and emotional, and hence situational. Frequently, the term 'troll,' or 'trolling,' is used synonymously with toxicity. Regardless, these behaviors must be addressed because they are a key factor in the outright hostility of online gaming environments to those perceived as other. This destabilization maps onto offline models of gender, race, class, ethnic, national, linguistic, and ableist-based hate speech. The virtual worlds of online gaming are fertile ground for this analysis. For example, Kou and Nardi (2013), in their research on League of Legends, found that antisocial behavior destabilizes online communities but is addressable by social code and regulatory systems.
The targeted examples in online gaming environments offer a concentrated sample of toxic behavior that is pervasive in every online environment. Studies have documented antisocial behavior and toxic speech on most digital platforms (O'Sullivan & Flanagin, 2003). Whether considering flaming on 1980s and 90s USENET forums, social-media-fueled outrage over contemporary politics, or in-game, text-based conversation in multiplayer games, toxicity can be motivated by emotional, intellectual, political, or other causes and as such correlates strongly with the modes and consequences of offline hate speech. Understanding online toxic behavior in ways that allow for moderation of its causes and effects first requires an understanding of how to study the concrete manifestation of this behavior, namely the text produced by users of a system. The challenges faced by this research are many: the writing styles are heavily infused with jargon, the orthography is non-standard, the chat stream represents only one channel of communication, and the communities are fluid.
In response to these challenges, an approach grounded in machine learning and natural language processing (NLP) was tested. Using a small subset of the available data, a classifier developed to separate toxic players from non-toxic players yielded a precision of 0.77, a recall of 0.79, and an F-score of 0.78. These results are encouraging. Along with training on a larger data set, secondary factors such as player avatar gender and length of match, among others, were also preliminarily tested and found to have small influencing effects. These results suggest that there are concrete, detectable semantic and syntactic patterns in the harassment levied at players in these games. Connecting these findings to mechanisms for shielding, reforming, and censuring players, and to frameworks for understanding the social and psychological costs of being effectively locked in a room with one or more individuals determined to verbally abuse a peer, is the more complex task of the cultural and ethnographic studies of digital communities.
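For readers less familiar with these evaluation measures, the relationship among the three reported figures can be sketched briefly. The following is a minimal illustration, assuming that the reported F-score is the balanced F1 measure (the harmonic mean of precision and recall); the source does not state which variant was used, and the function name `f_score` and its `beta` parameter are introduced here only for illustration.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score for a given precision and recall.

    With beta = 1.0 (an assumption about the reported figure), this is the
    harmonic mean of precision and recall, commonly called F1.
    """
    if precision == 0.0 and recall == 0.0:
        # Avoid division by zero when the classifier finds nothing correct.
        return 0.0
    beta_sq = beta * beta
    return (1.0 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Precision and recall as reported for the toxic / non-toxic player classifier.
reported = f_score(0.77, 0.79)
print(round(reported, 2))
```

Under that assumption, the computed value rounds to the reported 0.78, so the three figures are mutually consistent.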