Reading in translation is an impoverishment, not so much because of the fluctuating quality of translations or the loss of a perceived 'original work' as because of the elision of sociolinguistic context and the difficulty of conveying that lost context to readers. The problem is compounded by the fact that world literature is almost always taught in translation at universities in America and elsewhere, a situation driven by the low capacity for foreign language instruction at the university level and the broad linguistic reach of world literature courses. Of the estimated 20.2m American undergraduates in 2015, an estimated 1.6m were enrolled in foreign language courses, and, historically, only 17% of those reach the higher levels of proficiency necessary to read literature (Goldberg et al., 2013). The problem of American monolingualism has even affected the publishing industry; as the noted translation theorist Lawrence Venuti argued (1998), more English texts are being translated into other languages than ever before, while fewer foreign texts are translated into English. This trend means that American students are even less likely to feel comfortable with literature in translation than students from non-English-speaking countries.
However, conversations about literary summarization, a core practice of critical reading, take place globally and are preserved in the background of pages on every national-language Wikipedia. As each language community summarizes its translations of works of world literature in the form of Wikipedia articles, it generates for every crowdsourced entry a discussion page and a history of that page; together, these reveal the literary work's history of reception for a reading community in a given language. By striving to synthesize an authoritative, peer-reviewed summary of a text native to or translated into that community's language, each group highlights its concerns, its thought processes, and the challenges posed by a given work. Our goal is to make these differences and conversations more visible to readers through techniques from natural language processing for automatic translation, topic modeling, and the visualization of topic models; in so doing, we aim to develop a method by which the digital humanities can address a core problem in comparative and world literature.
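To illustrate where these discussion pages and their histories can be reached, the following minimal Python sketch builds a MediaWiki API request for the revision history of an article's talk page. It is not the project's own code: the function names, the User-Agent string, and the article titles are assumptions made for illustration, and the sketch assumes that the canonical `Talk:` namespace prefix resolves on each language edition.

```python
import json
import urllib.parse
import urllib.request

# Assumed article titles for the Odyssey on three language editions.
ODYSSEY_TITLES = {"en": "Odyssey", "it": "Odissea", "es": "Odisea"}


def talk_history_url(lang, title, limit=50):
    """Build an API URL listing recent revisions of an article's talk page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": f"Talk:{title}",
        "rvprop": "timestamp|user|comment",
        "rvlimit": limit,
        "format": "json",
        "formatversion": 2,
    }
    base = f"https://{lang}.wikipedia.org/w/api.php"
    return base + "?" + urllib.parse.urlencode(params)


def fetch_revisions(lang, title, limit=50):
    """Download revision metadata for a talk page (needs network access)."""
    request = urllib.request.Request(
        talk_history_url(lang, title, limit),
        # Hypothetical identifier; a real client should name itself properly.
        headers={"User-Agent": "reception-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    # With formatversion=2 the pages arrive as a list.
    page = data["query"]["pages"][0]
    return page.get("revisions", [])


# Print the request URLs without contacting the network.
for lang, title in ODYSSEY_TITLES.items():
    print(talk_history_url(lang, title, limit=10))
```

Calling `fetch_revisions("it", "Odissea")` would return one dictionary per revision, with timestamp, user, and edit comment, which is the raw material for a change-over-time view of a discussion.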
Machine-translated Wikipedia discussions can help reveal to monolingual audiences the degree to which cultural pragmatics influence the reception of key (and popular) works of world literature. Although flawed, automatic translation has been found for some languages to be comparable with human translation, at least with regard to cohesion and formality (Li et al., 2014). Presenting parallel national-language conversations – such as the conversations about translating the title of Camus' L'Étranger / The Stranger / Der Fremde / The Foreigner, or a visualization of the change over time of the summaries of Goethe's Faust I in English, German, and Spanish – would help demonstrate the practical reception of a work in a language community. While work has been done on multilingual topic models (Ni et al., 2009), our research assumes that there will be both alignment and misalignment of topics across the various languages of a work; as such, our project resists the urge to normalize those topics into one category on the basis of an imperfect vector model of semantic similarity. Furthermore, experiments on iterative summarizations, such as those of elements of The Tale of Genji, demonstrate how even basic tasks in literary scholarship contain cultural dimensions and can thus reveal strong cultural patterns and biases held by different populations (Kashima, 2000). Along with the works mentioned above, this research explores the Wikipedia conversations around J.D. Salinger's Catcher in the Rye / Der Fänger im Roggen / El guardián entre el centeno o El cazador oculto / L'Attrape-cœurs / Il giovane Holden and Homer/Omero's The Odyssey / Die Odyssee / Odisea / L'Odyssée / Odissea across English, German, Spanish, French, and Italian Wikipedia pages.
In an increasingly digitized cultural landscape, the pedagogical use of Wikipedia and similar platforms has gained a great deal of traction in certain fields. While it is often derided as a dubious source of information, Wikipedia has proven to be a successful arena for instructing students in the distillation of information, writing in the public sphere, and collaborative writing (Purdy, 2009; Vetter, 2014; Sweeney, 2012). As a multilingual space, Wikipedia has offered many scholars the opportunity to examine the ways in which the cross-cultural sharing of information takes place (Nothman et al., 2013; Filatova, 2012). In some instances, Wikipedia specifically has been used as a place for the comparison of knowledge and representation across cultural and geographic divides (Callahan and Herring, 2011). These scholars provide a framework for examining the cross-linguistic aspects of Wikipedia in order to highlight cultural differences and deconstruct colonial power structures that privilege the English language (Ensslin, 2011). In the available scholarship, the use of Wikipedia in teaching literature has been largely ignored; only a few studies exist, and those mainly address the reluctance of literary studies to incorporate wikis into pedagogy (Bayliss, 2013). However, quite a few studies suggest that Wikipedia can occupy a distinct operational space in the university that supplements established pedagogical spaces and practices (Gorard and Selwyn, 2015; Knight and Pyke, 2012).
Our computational framework follows these possibilities to supplement two established methods of teaching translation in a world literature classroom. The first method consists of placing a work of literature in translation – a passage from a novel, for example – in relation to the same work in its original language. The hope is that students can analyze the differences between the original and the work in translation and, thereby, understand the way the literary text takes on a 'new life' in translation. This method not only assumes a sophisticated knowledge of foreign languages among the majority of students but also brackets the question of how these works are read in the original language by native-language readers. The second method juxtaposes several English translations of the same work and so does not assume knowledge of a foreign language. For example, students might read various translations of the 19th-century French poet Baudelaire in English, starting from English translations from the late 19th century and continuing through translations that have been published in the last twenty years. The advantage of this method is clear: students can easily see, in their own language, the way translation can change the meaning of a poem as they read it from one translation to another. While this method successfully bypasses the problem of foreign language competency, it amplifies the second problem: the student is even more divorced from the source-language context since all emphasis is placed on the way English speakers respond to a work of literature.
New methods, therefore, are needed to expose foreign texts and foreign contexts. Using computational methods to access the themes and arguments about specific texts in other languages reduces the need for high-level competency in a foreign language in order to understand the debates about world literature in a non-American – and especially non-English – context. Automatic translation and topic modeling facilitate encounters with Wikipedia 'talk' pages in other languages. The rough output of these automated and annotated procedures preserves some of the estrangement of working across languages: the translation visibly announces itself as translation and therefore serves its purpose without fully replacing the source text. With such experiments, our project broadly addresses two questions: how different language populations create summaries that are culturally distinct, and how these differences can be folded back into meaningful encounters for readers of works in translation. The overall method for this project is to identify the subset of well-documented and significant works, mine the relevant Wikipedia entries and conversations, and develop preliminary code to identify the linguistic features of the entry (e.g. topics, use of modals, noun density, phraseological structures, complexity measures, etc.) and the nodal points of the crowd's conversation that yielded the page. As an example, consider the discrepancies across the Italian and English discussion pages of The Odyssey / Odissea, each represented by ten topics of roughly 5 to 10 terms.
| Terms, Italian Discussion Page | Topic Name |
| --- | --- |
| odissea originali palla contributi registrarmi | Contributions and Registration |
| wikipedia forum migliorarla posto figlio arrivassero tenter tentare importanza | Need for More Expert Contributors |
| ulisse niente aggiunto dante manchi riferimento elenco procedo cielo parte pietose condizioni | References to Dante |
| poseidone divina generale voglia | Unclear |
| commedia incontra quesiti accecato ostacola competenti voce | Other Web Sources |
| ripristino aggiungerei siti porre partenza traducendo manna materia versione | Older Versions are Better than Newer Versions |
| dir notizie rete evidente polifemo | Unclear |
| inglese qualcuno pensa piccola inferno appositi pagina discutere film serve deciso | Using English Wikipedia to Backfill Italian |
| esattamente pecca opere fin tema passo potere utenti | Reflections on Contributors |
| pare so troia attendo basandomi mesi trova provvisorio | Reflections on What to Write |

| Terms, English Discussion Page | Topic Name |
| --- | --- |
| odyssey homer work titles guideline searching works section promotional directed | Editing Guidelines and Work Title |
| february page common dab crazynas people epic redirect iliad dictionary | Genre and Other Works |
| talk odysseus december journey extant account term proposal giu september | The Journey |
| word article wp called style subject similar adventures locations toronto | Locations in the Story and Writing Style |
| utc play topic primary part poem link note odisseus july | Poetry |
| title article refer word davidiad edited cynwolfe don titles voyage | Voyages |
| edit western preceding map request www apology talk zcc suggest | Mapping the Journey |
| medea akhilleus euripides noun don current literature click crazynas english | Characters |
| utc april added source staged meant argument written fall cite | Sourcing and Staging |
| talk comment unsigned university musical oldest review tedickey tomb point | Ceremony |
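Term lists of the kind shown in the tables above can be derived with a topic model. The text does not specify which algorithm or settings were used, so the following minimal sketch substitutes a plain latent Dirichlet allocation fitted by collapsed Gibbs sampling, written with the Python standard library only. The function names, the hyperparameters `alpha` and `beta`, and the sample comments are all assumptions made for illustration; the comments are invented and are not real talk-page text.

```python
import random
import re
from collections import Counter


def tokenize(text, stopwords=frozenset()):
    """Lowercase word tokens of three or more letters, minus stopwords."""
    words = re.findall(r"[^\W\d_]{3,}", text.lower())
    return [w for w in words if w not in stopwords]


def lda_top_terms(docs, n_topics=10, n_terms=10, n_iter=200,
                  alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return the top terms per topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per doc, topic
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per topic, word
    topic_total = [0] * n_topics                       # tokens per topic
    assignments = []
    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Take the token out of the counts before resampling it.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic given all other tokens.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return [[w for w, c in topic_word[k].most_common(n_terms) if c > 0]
            for k in range(n_topics)]


# Invented comments, for illustration only.
comments = [
    "The title should follow the guideline for works of Homer",
    "Please add a source for the map of the journey",
    "Homer and the title of the epic: see the guideline",
    "The journey map needs a better source",
]
stop = {"the", "for", "and", "see"}
docs = [tokenize(c, stop) for c in comments]
for terms in lda_top_terms(docs, n_topics=2, n_terms=5, n_iter=100):
    print(" ".join(terms))
```

In a sketch like this, each talk-page comment is treated as one document. Usernames and timestamp fragments survive tokenization unless they are filtered, which may be why terms such as usernames and "utc" appear in the English lists.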
What comes across in this comparison is the somewhat different concerns of the two reading communities. The Italian discussion reflects concerns with the expertise of the editing community, a social reflection common to Wikipedias, and with connections between Homer and Dante, a figure more central to Italian literary identity. The English page reflects a concern with the Odyssean journey and its possible real-world correspondences. More commonality was found in another example comparing the discussion pages of J.D. Salinger's The Catcher in the Rye. These alignments and misalignments of readers' concerns can destabilize the primacy of the concerns held by a given reading community and speak to one of the core benefits of reading world literature.
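One way to surface alignment and misalignment without merging topics across languages is to score term overlap and to leave weak matches explicitly unaligned. The following Python sketch does this with Jaccard overlap. It is an illustration under stated assumptions, not the project's procedure: the function names and the threshold are hypothetical, and the small hand-made glossary stands in for machine translation. The topic rows are taken from the Odyssey / Odissea comparison.

```python
def jaccard(a, b):
    """Share of terms two topics have in common (0 = none, 1 = identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def align_topics(source_topics, target_topics, glossary, threshold=0.1):
    """Pair each source topic with its closest target topic, or with None.

    `glossary` maps source-language terms to target-language terms and is a
    stand-in for machine translation; unmapped terms are kept unchanged.
    """
    pairs = {}
    for name, terms in source_topics.items():
        translated = [glossary.get(t, t) for t in terms]
        best_name, best_score = None, 0.0
        for other, other_terms in target_topics.items():
            score = jaccard(translated, other_terms)
            if score > best_score:
                best_name, best_score = other, score
        # Weak matches are reported as unaligned rather than merged.
        if best_score >= threshold:
            pairs[name] = (best_name, best_score)
        else:
            pairs[name] = (None, best_score)
    return pairs


italian = {
    "Contributions and Registration":
        "odissea originali palla contributi registrarmi".split(),
    "Using English Wikipedia to Backfill Italian":
        "inglese qualcuno pensa piccola inferno appositi pagina "
        "discutere film serve deciso".split(),
}
english = {
    "Editing Guidelines and Work Title":
        "odyssey homer work titles guideline searching works section "
        "promotional directed".split(),
    "Characters":
        "medea akhilleus euripides noun don current literature click "
        "crazynas english".split(),
}
# Hand-made glossary, illustrative only.
glossary = {"odissea": "odyssey", "inglese": "english", "pagina": "page"}

for topic, (match, score) in align_topics(italian, english, glossary).items():
    print(f"{topic} -> {match} ({score:.2f})")
```

Because unaligned topics are returned with `None` rather than forced into the nearest category, a reader can see both the overlaps and the gaps between the two communities' concerns.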