Sentiment analysis and opinion mining methods are established for automatically summarizing information shared by users in product reviews or on social media platforms like Twitter, Facebook or more specific fora (Liu 2015). These approaches can be categorized into coarse-grained and fine-grained methods: The former focus on assigning a polarity (positive, negative, neutral) and optionally an intensity to a text snippet (Täckström and McDonald 2011; Pang and Lee 2004). The latter additionally aim at detecting the opinion holder (for instance a specific person mentioned in a news article) and the target (for instance a specific aspect of a product in a review) (Hu and Liu 2004; Popescu and Etzioni 2005; Jakob and Gurevych 2010).
Transferring such methods to the analysis of literature leads to at least two questions: Firstly, are polarities for this domain as helpful as for the analysis of reviews? Secondly, how can such methods from sentiment analysis be improved, and what can they contribute to literature analysis?
Regarding the first aspect, resources to measure the occurrence of words associated with different emotions have been developed for English but, to the best of our knowledge, not for German (Mohammad et al. 2015). Secondly, it should be noted that research in German sentiment analysis is still comparatively limited (exceptions include Ruppenhofer et al. 2014; Klinger and Cimiano 2015; Remus, Quasthoff, and Heyer 2010). In addition, sentiment analysis has mainly focused on Web content such as social media and product reviews. However, the analysis of emotions and sentiment in literature has been shown to be of interest and value (Mellmann 2007; Winko 2003). A prerequisite for a quantitative approach is that emotions are (at least to some extent) a surface phenomenon (Hillebrandt 2011, p. 154), i.e., that words carry information such that it is possible to infer “private states” of specific emotions (Wiebe, Wilson, and Cardie 2005).
Our two main contributions are: (a) we make German dictionaries of words associated with seven fundamental emotions publicly available, and (b) we perform a case study of emotion analysis on Kafka’s “Amerika” and “Das Schloss” to support literature studies with a focus on complex non-factual phenomena and the analysis of personality traits. All resources and software used in this paper are made publicly available at http://www.romanklinger.de/emotion/.
The goal of this work is to detect different emotions represented in literary texts. Psychological research offers different models to categorize emotions. The most common ones include Plutchik’s wheel of emotions (Plutchik 2001) and Ekman’s definition of fundamental emotions (Ekman 1999). Russell (1991) offers a discussion of relevant context. We opt for roughly following the structure of Ekman’s definition of emotions and focus on anger (Wut), disgust (Ekel), fear (Angst), enjoyment (Glück), sadness (Trauer), surprise (Überraschung), and contempt (Verachtung).
To track emotions over the whole text, we assign an emotion score $es(e, t_a^b)$ to a sequence of consecutive tokens $t_a^b = (t_a, \dots, t_b)$ from textual position $a$ to position $b$ as
$$es(e, t_a^b) = \frac{1}{|D_e|} \sum_{i=a}^{b} \mathbb{1}_{t_i \in D_e},$$
where $D_e$ is a dictionary containing words expressing the specific emotion $e$ and $\mathbb{1}_{t_i \in D_e}$ is 1 if and only if $t_i \in D_e$ and 0 otherwise. This score corresponds to the number of tokens that are both in the window and in the respective emotion dictionary, normalized by the dictionary size.
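The score described above can be illustrated with a minimal Python sketch. The function name `emotion_score`, the 0-based inclusive positions, and the toy fear dictionary are assumptions made for illustration only; they are not taken from the released software.

```python
def emotion_score(tokens, a, b, dictionary):
    """Return the number of dictionary hits in tokens[a..b], divided by |D_e|.

    Positions are 0-based and inclusive here (an assumption of this sketch).
    """
    if not dictionary:
        raise ValueError("emotion dictionary must not be empty")
    hits = sum(1 for token in tokens[a:b + 1] if token in dictionary)
    return hits / len(dictionary)


# Hypothetical toy dictionary for fear; entries are lower-cased, not stemmed.
fear = {"ängstlich", "gefahr", "unruhig", "gewalt"}
tokens = ["die", "gefahr", "war", "groß", "und", "er", "blieb", "unruhig"]

print(emotion_score(tokens, 0, 7, fear))  # 2 hits / 4 entries = 0.5
print(emotion_score(tokens, 2, 6, fear))  # no hits in this window = 0.0
```

Because the score is divided by the dictionary size rather than the window length, scores for emotions with dictionaries of different sizes remain comparable within one text, while the absolute values grow with the window.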
To track the development of the emotions over the whole text, we apply a sliding window approach parameterized by the window size w such that b = a + w − 1 (w can be interpreted as a smoothing parameter). To allow for a character-oriented analysis, we assign an emotion score as in the sliding window approach, but for windows around each mention of a given character in the text, with an additional normalization by the number of mentions of that character. Each token and dictionary entry is normalized by mapping to lower-case and stemming with the Porter stemmer (Porter 2001).
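The sliding window and the character-oriented variant might be sketched as follows. This is an illustration under stated assumptions, not the released implementation: all function names are hypothetical, windows around a mention are assumed to be centred on it and clipped at the text boundaries, the normalization is assumed to be a plain average over mentions, and `normalize` only lower-cases (the stemming step is omitted to keep the sketch free of external dependencies).

```python
def normalize(token):
    # Lower-casing only; the stemming step described in the text is omitted.
    return token.lower()


def window_score(tokens, a, b, dictionary):
    # Dictionary hits in tokens[a..b] (0-based, inclusive), divided by |D_e|.
    hits = sum(1 for token in tokens[a:b + 1] if token in dictionary)
    return hits / len(dictionary)


def sliding_scores(tokens, dictionary, w):
    # One score per start position a, with b = a + w - 1.
    return [window_score(tokens, a, a + w - 1, dictionary)
            for a in range(len(tokens) - w + 1)]


def character_score(tokens, name, dictionary, w):
    # Assumption: each window is centred on a mention and clipped at the
    # text boundaries; the summed scores are divided by the mention count.
    half = w // 2
    mentions = [i for i, token in enumerate(tokens) if token == name]
    if not mentions:
        return 0.0
    total = sum(window_score(tokens, max(0, i - half),
                             min(len(tokens) - 1, i + half), dictionary)
                for i in mentions)
    return total / len(mentions)


# Hypothetical toy input and dictionaries (lower-cased, unstemmed).
text = "Karl war glücklich und Karl war ängstlich"
tokens = [normalize(t) for t in text.split()]
joy = {"glücklich", "gut"}
fear = {"ängstlich", "gefahr"}

print(sliding_scores(tokens, joy, 3))
print(character_score(tokens, "karl", fear, 5))
```

A larger w smooths the resulting curve at the cost of temporal resolution, which is why the window size can be read as a smoothing parameter.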
As a resource for the emotion dictionaries, two authors of this paper manually sorted words from different sentiment polarity, subjectivity, and emotion resources in German and English (translated to German) into the emotion categories (Waltinger 2010a; Waltinger 2010b; Remus, Quasthoff, and Heyer 2010; Mohammad and Turney 2013). We semi-automatically enriched this resource with synonyms (Naber 2005; Wermke, Kunkel-Razum, and Scholze-Stubenrecht 2010).
As an estimate of the difficulty of emotion assignments, we performed an annotation experiment on 300 words (a stratified sample from all emotions in the dictionary mentioned above) with fluent speakers of German. For 85 % of all words, two out of three annotators agree on the same emotion; however, for only 46 % of all words do all three annotators agree on the associated emotion.
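Agreement figures of this kind can be computed as in the following sketch. The function name and the toy annotations are hypothetical, and counting unanimous words among those with a two-annotator majority is an assumption of this example, not a statement about how the reported numbers were derived.

```python
from collections import Counter


def agreement_rates(annotations):
    """Return (share with at least two equal labels, share with all equal)."""
    n = len(annotations)
    if n == 0:
        raise ValueError("no annotations given")
    # A word has a majority if its most frequent label occurs at least twice.
    majority = sum(1 for labels in annotations
                   if Counter(labels).most_common(1)[0][1] >= 2)
    # A word is unanimous if all annotators chose the same label.
    unanimous = sum(1 for labels in annotations if len(set(labels)) == 1)
    return majority / n, unanimous / n


# Hypothetical labels from three annotators for four words.
toy = [
    ("fear", "fear", "fear"),
    ("fear", "anger", "fear"),
    ("enjoyment", "sadness", "surprise"),
    ("disgust", "disgust", "contempt"),
]
print(agreement_rates(toy))  # (0.75, 0.25)
```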
As a use case, we apply our methods to Franz Kafka’s “Der Verschollene” (“Amerika”) and “Das Schloss”. The latter is especially interesting because a comprehensive emotion-focused manual analysis is available (Hillebrandt 2011). It is narrated in the third person and is interesting from an emotion analysis point of view, as the attribution of specific emotions to the protagonist is difficult (Hillebrandt 2011, p. 165).
Figures 1 and 2 visualize the development of emotions resulting from our analyses. In “Das Schloss”, the strong increase of surprise towards the end is striking (the most indicative words are “neu”, “schnell”, “plötzlich”, “ungeduldig”). Another eye-catching feature is the peak of fear shortly after the start of chapter 3 (“ängstlich”, “Gefahr”, “unruhig”, “Gewalt”). In “Amerika”, one striking characteristic is the decrease of enjoyment after a peak in chapter 4 (“gut”, “Mutter”, “glücklich”), followed by disgust in chapter 5 (“unerträglich”, “Elend”, “schrecklich”, “beschmutzt”). Emotions for each mention of a selection of characters in “Amerika” and “Das Schloss” are shown in Figures 3 and 4.
This work has been partially funded by the CRETA project (Zentrum für reflektierte Textanalyse, http://www.creta.uni-stuttgart.de/), funded by the German Ministry of Education and Research.