While early English drama has been the subject of several DH investigations, because “a quite explicit and rigid system of metadata is part of the genre itself” (Mueller, 2014: par. 27), French theatre of the 17th and 18th centuries remains largely underexplored, in spite of demonstrations and case studies presented by authors with large international audiences (Moretti, 2005). This paper presents some results of a project inspired by the explorations of the English corpora, but dedicated to the computer-aided identification and analysis of theatrical topoi in the French domain.
Topoi can be defined as “recurrent configurations of thematic or formal elements” (adapted from Weill and Rodriguez, SATOR, 1996). They are highly significant for French plays, which draw largely on a common stock of characters, scenes and themes, and in which novelty often consists in reusing earlier material with a shift. Considering the vast number of plays produced over the period running roughly from 1630 to 1790 (some 12,000 plays, even if the text of many of them is lost), the use of computational tools to track the presence of various topoi over time appears appropriate. Computer aid can help answer, in a (more) rigorous way, questions concerning the aesthetics of the French theatre of the period, too often reduced to “classicism”, as well as historical questions about when and why certain units of meaning come into wider use or are relatively abandoned.
While the results of this endeavour are at a very early phase, for reasons to be detailed in the presentation, the very attempt to clarify some of these literary questions shows that the use of the computer is not neutral and modifies the problems to be solved. After a more detailed presentation of the activities undertaken to date, this paper will reflect upon the way in which the methods and tools used to approach the texts modify the initial research questions.
Dramacode: an expanding community for French drama
Identifying topoi with a computer requires, first of all, digitised texts in a more elaborate format than the PDFs or plain texts offered by Gallica (see http://gallica.bnf.fr). Various initiatives have been taken in this domain in France, either by scholars supported by higher education institutions and research institutes (see “Molière project” http://obvil.paris-sorbonne.fr/corpus/moliere/moliere) or by individuals with computer skills and a special interest in literature of the 17th and 18th centuries (see http://www.theatre-classique.fr). Unfortunately, this diversity of initiatives has resulted in a great variety of encoding practices, more or less TEI-compliant. To reduce this diversity and move towards common standards, an organisation has been created on GitHub. Under the name of “Dramacode”, it hosts 882 plays written or staged between 1630 and 1810 (roughly 7.3% of the dramatic production of the period), to be progressively re-tagged and enriched with further mark-up. “Dramacode” also makes it possible to publish the contents with a harmonised style sheet, and to share work on experimental visualisations of data extracted from the plays.
Finding theatrical topoi through linguistic analysis and literary mark-up
Within this corpus, the plays of Louis de Boissy (1694-1754) have been selected for a pilot study intended to develop strategies for gathering, through various queries, sufficient clues about the existence of a topos in an unknown (i.e., not previously read) play, digitised and encoded in another project.
Louis de Boissy was a French playwright who enjoyed a certain success in the 18th century, mainly because of two of his plays, Le Français à Londres and L’Homme du jour. Beyond these achievements, what makes his work interesting is precisely its mediocrity. Boissy’s works are a perfect example of the reuse of topoi, sometimes to follow fashion, sometimes in an attempt to distinguish himself by revitalising conventional themes and dialogues. He is also unusual in having written for the three main theatrical venues of the period, namely the Comédie-Française, the Théâtre Italien and the Théâtres de la Foire. His texts therefore offer a privileged vantage point on French comedy in the first half of the 18th century.
Two approaches to the identification of topoi are being developed. The first consists in developing an ontology for the description of characters and scenes. Roles and role descriptions from the existing plays have been extracted, cleaned, and analysed in an Excel spreadsheet. Ten categories appear necessary for describing characters of the classical French stage. Five of them have been known and formalised since Aristotle or, at least, since the rediscovery of Aristotle by French authors and thinkers of the Renaissance and then of the 17th century: nationality (“climat” in Marmontel’s words, see entry “Moeurs” in Eléments de littérature, 1787), age, sex, occupation (“état”) and temperament. Five others were deemed of interest because they may allow automatic analysis aimed at spotting changes in aesthetic principles: the ontological category (human/nonhuman/semihuman), the dramatic or actantial role, the social status (“king”, “prince”, “marquis”, “Dom”…: it seemed useful to distinguish these from information about “état”), the family position, and the collective nature of some characters.
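As an illustration only, the ten categories listed above could be held in a small record type such as the following Python sketch. The class name, field names and the helper function are hypothetical and are not taken from the project's actual schema; the example character is invented.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CharacterDescription:
    """Hypothetical record for the ten descriptive categories."""
    # Five categories inherited from the Aristotelian tradition
    nationality: Optional[str] = None           # "climat"
    age: Optional[str] = None
    sex: Optional[str] = None
    occupation: Optional[str] = None            # "état"
    temperament: Optional[str] = None
    # Five further categories retained by the project
    ontological_category: Optional[str] = None  # human / nonhuman / semihuman
    dramatic_role: Optional[str] = None         # dramatic or actantial role
    social_status: Optional[str] = None         # "king", "prince", "marquis", "Dom"...
    family_position: Optional[str] = None
    collective: Optional[bool] = None           # True for collective characters


def missing_categories(character: CharacterDescription) -> List[str]:
    """Return the names of the categories that are still undescribed."""
    return [f.name for f in fields(character)
            if getattr(character, f.name) is None]


# Invented example: a partially described character
example = CharacterDescription(sex="male", social_status="marquis",
                               ontological_category="human")
```

Listing the categories that are still empty for a character is one simple way to see how much manual description remains to be done.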
In parallel, types of scenes and other significant recurrences are manually identified and marked up as “recognition”, “fight”, “complot”, etc. Relevant lines or phrases (labelled with xml:ids) are grouped as <span>s in the <back> section of <text>. Although each span currently receives a free-text description, these descriptions will progressively feed into a separate library, which will later be referenced using @ana.
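The stand-off mark-up described above can be read back with a few lines of Python. The sketch below is a minimal illustration under several assumptions: the fragment is invented, it omits the TEI namespace for brevity, and the use of <spanGrp>, @from and @to to point at the labelled lines is one possible encoding, not necessarily the one adopted in Dramacode.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented fragment with placeholder text (assumption: no TEI namespace)
SAMPLE = """<text>
  <body>
    <sp who="#a"><l xml:id="l1">placeholder line one</l>
                 <l xml:id="l2">placeholder line two</l></sp>
    <sp who="#b"><l xml:id="l3">placeholder line three</l></sp>
  </body>
  <back>
    <spanGrp type="topoi">
      <span from="#l1" to="#l2">recognition</span>
      <span from="#l3">fight</span>
    </spanGrp>
  </back>
</text>"""


def topoi_index(xml_text):
    """Return (description, first line id, last line id) for each span."""
    root = ET.fromstring(xml_text)
    known_ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    index = []
    for span in root.iter("span"):
        start = span.get("from", "").lstrip("#")
        # A span without @to covers a single line
        end = span.get("to", span.get("from", "")).lstrip("#")
        if start in known_ids and end in known_ids:
            index.append(((span.text or "").strip(), start, end))
    return index
```

Spans whose pointers do not resolve to an existing xml:id are skipped here, which gives a rough consistency check on manual tagging.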
The second approach is based on a linguistic analysis of the speeches, using AntConc to generate concordances and to identify keywords. The initial objective was to spot keywords or linguistic idiosyncrasies of the “petit-maître” character, whose identification was the primary goal of this research. Indeed, “petits-maîtres” are not always labelled as such, and their presence in French drama has therefore been underestimated and incorrectly tied to a specific period (mainly the “decadence comedy” after 1680). In time, a complete mark-up of the characters according to the above-mentioned categories (e.g., <temper>) will help to identify as “petits-maîtres” those who are not so called by roleDesc or by other characters in the play; for the moment, however, analysis of their speech is the only option for answering the research question.
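For readers unfamiliar with concordances, the following Python sketch shows the principle of a keyword-in-context (KWIC) listing. It is not AntConc and does not reproduce its options; the function name and the crude word-based tokenisation are assumptions made for the example.

```python
import re


def kwic(text, keyword, width=3):
    """List each occurrence of keyword with `width` words of context."""
    # Crude tokenisation: runs of word characters, lower-cased.
    # Elided forms such as "j'ai" are split into "j" and "ai".
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits
```

Each hit is returned as a (left context, keyword, right context) triple, which can then be sorted or counted to look for recurrent phrasings around a given word.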
New research vistas
Looking back at the trajectory of this study, the digital approach can be said to have significantly modified the initial perspective and research questions. The TEI tagging of characters and scenes revealed the need to refine the methods and categories developed by narratology and theatrical analysis in the 1970s and 1980s, preparing the ground for a more accurate description of character building.
In the meantime, the effort to describe the stylistic specificity of the “petits-maîtres” proved to be rooted in a larger, twofold research question concerning the potential differences in speech between characters of the same author. Digital humanities have been, to date, more interested in global stylistic analysis on the one hand and in authorship studies on the other, whereas this research project intends to apply the methods of the latter to fictional speakers. While such an endeavour can be justified by the claims of French dramatists of the classical centuries to act as “sociolinguists” of a kind, it also raises the question of how to automate the analysis of individual characters’ speeches on a large scale (more than 9,300 distinct characters are present in the above-mentioned set of plays), and of how to explain the observed recurrences. Indeed, during the pilot study carried out on Boissy’s plays, it appeared that his characters group, by their speech, not so much into male and female speakers as into two unexpected classes: those who tend to speak more about themselves (characters saying “je”) and those who address others (characters saying “vous”). The importance of “I” and “You” has already been noted in theatrical studies (see Craig, 2004), but not this distribution, which still needs to be checked against the larger corpus, for the moment a very time-consuming operation.
It is to be expected that further observation of concordances and n-grams extracted from very large collections of digitised plays will bring to light other unexpected recurrences of this kind, triggering a new understanding of what a classical play is, and contributing to the ongoing debate about the nature of the theatre.
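The “je”/“vous” grouping can be approximated by a simple count per speaker, as in the Python sketch below. This is an illustrative assumption about how such a check might be automated, not the procedure used in the pilot study: the function name, the decision rule (a plain majority) and the treatment of the elided form “j’” as “je” are all choices made for the example, and the sample speeches are invented.

```python
import re
from collections import Counter, defaultdict


def pronoun_profile(speeches):
    """Classify each speaker as 'je', 'vous' or 'tie'.

    speeches: iterable of (speaker, text) pairs.
    """
    counts = defaultdict(Counter)
    for speaker, text in speeches:
        tally = counts[speaker]  # registers speakers with no pronoun at all
        for token in re.findall(r"\w+", text.lower()):
            if token in ("je", "j"):      # "j" comes from elided "j'"
                tally["je"] += 1
            elif token == "vous":
                tally["vous"] += 1
    profile = {}
    for speaker, tally in counts.items():
        if tally["je"] > tally["vous"]:
            profile[speaker] = "je"
        elif tally["vous"] > tally["je"]:
            profile[speaker] = "vous"
        else:
            profile[speaker] = "tie"
    return profile


# Invented speeches, for illustration only
SAMPLE_SPEECHES = [
    ("A", "Je pense que j'ai raison."),
    ("B", "Vous croyez ? Vous avez tort, je crois."),
    ("A", "Vous verrez."),
]
```

A raw majority is sensitive to the length of each role, so a check against the larger corpus would probably need relative frequencies rather than absolute counts.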