Initiated in 2007, the Corpus Coranicum project of the Berlin-Brandenburg Academy of Sciences and Humanities aims to build a comprehensive digital information system that provides access to materials relevant to the history of the Qur’an: digitized versions of the oldest qur’anic manuscripts and their transliterations, comparisons of variant readings for each verse, texts from the environment of the Qur’an, and commentaries for each sura that take all of these elements into account. Both manuscript evidence and variant readings can be seen as the foundation for a future critical edition of the Qur’an. Following the German philological approaches to the history of the Qur’an before World War II, such as the “Wissenschaft des Judentums”, a reform movement founded by Abraham Geiger (1810-1874), and Gotthelf Bergsträßer's (1886-1933) “Korankommission” of the Bavarian Academy of Sciences, the Qur’an project in Potsdam is working on a sustainable solution for exploring the history of the Qur’an, this time digitally.
This information system does not confine itself to the digital reproduction of the holy text but uses international standards such as XML, Unicode and TEI to ensure the long-term readability and archivability of the research, text analyses and editorial work. The print edition of the Qur’an produced in Cairo in 1924 serves as the reference for documenting the material on the textual history, since that print had a tremendous influence on subsequent prints during the 20th century. Following the analytical approach of Theodor Nöldeke (1836-1930), the project produces rich philological commentaries for each sura, setting out their chronological order and emphasizing the development of their literary forms across the 22 years of the prophet's proclamation.
Viewing the Qur’an as a text proclaimed in Arabia in Late Antiquity, the Corpus Coranicum project provides access to a collection of testimonies labeled “Texte aus der Umwelt des Korans” (“Texts from the Environment of the Qur’an”). There, texts in Hebrew, Syriac, Greek, Arabic, Ancient South Arabian, Ethiopian and other languages are being gathered, transcribed and translated in order to highlight intersections and point out differences between the Qur’an and other documents from Late Antique cultures, religions and traditions. Thus, the messages of the Qur’an can be viewed and understood in their respective contexts and in a new light.
Furthermore, the project gathers archeological evidence and carries out radiocarbon dating of qur’anic parchments in an ongoing German-French cooperation (Coranica, 2011-2014; Paleocoran, 2015-2018) to contribute substantially to the understanding of the Qur’an's history and the emergence of Islam. A joint goal of Corpus Coranicum and Paleocoran is to bring together, in digital form, all manuscripts that were originally kept in old Cairo and are now scattered around the world, so that they can be presented in their original form and order. Beyond mere digitization of the manuscripts, the Corpus Coranicum provides modernized transliterations of the original Arabic script. These transliterations are displayed in “Coranica”, a font developed by the project, since other Arabic fonts such as MS Typesetting or Amiri fail to display all the characters occurring in the relevant texts of the project.
In addition to commentaries, contextualization and manuscript analysis, the Corpus Coranicum project is building a corpus of variant readings at the word level. Since the earliest times, the various readers of the Qur’an have recited the text in their own way. For each sura, verse and word coordinate in the Qur’an, the project compares the variant readings attested in the written sources with one another in order to show the varying tradition and interpretation of the holy text.
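The word-level comparison described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the `Coordinate` type, the record layout, the reader names and the placeholder readings are all hypothetical and do not reflect the project's actual data model or data.

```python
from collections import defaultdict
from typing import NamedTuple


class Coordinate(NamedTuple):
    """Hypothetical sura-verse-word coordinate of a single word."""
    sura: int
    verse: int
    word: int


# Hypothetical toy records: (coordinate, reader, reading).
# Reader names and readings are placeholders, not attested data.
RECORDS = [
    (Coordinate(1, 4, 1), "reader_a", "form_1"),
    (Coordinate(1, 4, 1), "reader_b", "form_2"),
    (Coordinate(1, 4, 1), "reader_c", "form_1"),
    (Coordinate(1, 4, 2), "reader_a", "form_1"),
    (Coordinate(1, 4, 2), "reader_b", "form_1"),
    (Coordinate(1, 4, 2), "reader_c", "form_1"),
]


def group_by_coordinate(records):
    """Collect, for every coordinate, the reading each reader uses there."""
    grouped = defaultdict(dict)
    for coord, reader, reading in records:
        grouped[coord][reader] = reading
    return dict(grouped)


def contested(grouped):
    """Return the coordinates at which the readers do not all agree."""
    return sorted(
        coord
        for coord, by_reader in grouped.items()
        if len(set(by_reader.values())) > 1
    )


if __name__ == "__main__":
    for coord in contested(group_by_coordinate(RECORDS)):
        print(coord)
```

Keying every reading by its coordinate keeps the comparison strictly word by word, which is the granularity the paragraph above describes.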
With the variant data accumulated so far, the variants can not only be analyzed on a word-by-word basis but can also be used to compute a general similarity measure between readers of the Qur’an. To this end, the variant readings are mapped into a vector space that can be represented as a multi-dimensional matrix. Each word occurring in the Qur’an is assigned a dedicated row in the Qur’an matrix. Each variant reading of that word at its particular sura-verse-word coordinate creates a column in that row, with the variant reading as its value. The same procedure is then applied to create a matrix for each reader of the Qur’an. Whereas the Qur’an matrix can have multiple non-empty values in a row, a reader matrix can have only one: the coordinate corresponding to the variant reading the reader uses at that particular sura, verse and word position. Since the angle between two vectors or matrices reflects their similarity to each other, the Euclidean distance (see below) is used to compute that similarity measure.
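The matrix construction described above can be sketched as follows. This is a minimal illustration rather than the project's implementation: the toy data, the reader names, the function names and the choice to flatten each reader matrix into a single 0/1 vector are assumptions made for the example. Only the Euclidean distance is taken from the text.

```python
import math

# Hypothetical toy data: coordinate (sura, verse, word) -> {reader: reading}.
# Readings are placeholders, not attested variants.
VARIANTS = {
    (1, 1, 1): {"reader_a": "v1", "reader_b": "v1", "reader_c": "v2"},
    (1, 1, 2): {"reader_a": "v1", "reader_b": "v2", "reader_c": "v2"},
    (1, 1, 3): {"reader_a": "v1", "reader_b": "v1", "reader_c": "v1"},
}


def build_columns(variants):
    """One row per coordinate, one column per distinct reading found there."""
    return {
        coord: sorted(set(by_reader.values()))
        for coord, by_reader in variants.items()
    }


def reader_vector(reader, variants, columns):
    """Flatten a reader's matrix: a single 1 per row, at the reading used."""
    vector = []
    for coord in sorted(variants):
        chosen = variants[coord].get(reader)  # None if no reading is recorded
        vector.extend(1.0 if r == chosen else 0.0 for r in columns[coord])
    return vector


def euclidean_distance(u, v):
    """Plain Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    cols = build_columns(VARIANTS)
    readers = ["reader_a", "reader_b", "reader_c"]
    vectors = {r: reader_vector(r, VARIANTS, cols) for r in readers}
    for i, first in enumerate(readers):
        for second in readers[i + 1:]:
            d = euclidean_distance(vectors[first], vectors[second])
            print(first, second, round(d, 3))
```

In this encoding, every coordinate at which two readers disagree adds 2 to the squared distance, so the distance grows with the square root of the number of disagreements, and readers who agree everywhere have distance 0.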
All the aforementioned branches of the Corpus Coranicum project (the history of the text, the manuscripts, the analyses of variant readings, the literary and chronological commentary, and the texts that surrounded and influenced the Qur’an) are brought together to develop a new perspective on the text. The Corpus Coranicum website goes beyond a traditional digital edition of the Qur’an and is better described as a digital framework or digital information system for the Qur’an, since it offers a variety of tools and different texts.
The project seeks to pick up where the German tradition of Qur’anic scholarship left off and to carry it further by digital means. In addition to the content created and the functionality implemented so far, the project aims to extend its range of features by providing internationalized versions of the website (English, French, Turkish) and by integrating the concordance of Rafi Talmon (1948-2004, Arabist, University of Haifa) to offer chrono-morphological and statistical analyses of the holy text.
The talk will give an overview of the current state of the project, portray the philological approaches and their technical applications, and present results of the similarity computations mentioned above.