DH 2016 Abstracts

The School of Salamanca on the Semantic Web

Launched in 2013, the project The School of Salamanca. A Digital Collection of Sources and a Dictionary of its Juridical-political Language is establishing a collection of more than a hundred sources from Iberian theologians and jurists of the 16^th and 17^th century. These texts deal with political and juridical topics and the collection of sources is supplemented by a dictionary that comprises, next to biographical information on the authors of the sources, the development of central terms and ideas of the Western history of political and legal ideas, as it is reflected in the source texts.

Both parts, the sources and the dictionary, will be published under Open Access conditions. In the beginning of 2016, the project’s web application will be launched online with the first batch of sources and dictionary entries as TEI-XML texts plus corresponding facsimile images. At the moment, we are running a proof-of-concept mechanism and some experiments, which at the time of release will be constituting a Linked Open Data interface to our data along with a SPARQL-Endpoint.

At the beginning of the presentation the project’s rationale and web application will be introduced shortly. But the focus of the presentation will be on giving insights in the workflow, the decision making process and the implementation of the LOD mechanisms, that have been realized:

1. The first aspect accentuated in the presentation is the modelling of the information contained in our TEI-XML data within a Linked-Data-environment: Which TEI elements or attributes are assigned to which objects and predicates of which ontology? How are these assignments processed in order to offer the data as semantic data? Problems that we are dealing with are the questions of how to record the temporal dimension of much biographical information (see Ramos, 2009; Mynarz, 2013), and how to cope with alternative values like e.g. conflicting data about the date of birth of a person? Do considerations such as these affect our main objectives, e.g. the TEI-scheme or the collection of data?

Here is what has been settled up to now: We will offer semantic data about the sources and about the authors of the collection. The ontologies we use are mainly the foaf-, bio-, relationship-, and SPAR-ontologies (see, among others, Peroni, 2014). The TEI-data will be transformed into RDF by the xtriples webservice (Schrade, 2015).

2. The second highlighted aspect concerns the very networking of the data itself and its utilization in the project’s infrastructure. This concerns technical issues, such as the questions of which services and resources should be – directly or indirectly – provided in order to offer our data for external reuse? It concerns issues of academic strategies such as negotiations with favored partner projects and data providers over the data they expose and the interfaces they provide. But it also concerns scholarly questions, such as eventual opportunities to handle new research questions, or to handle questions in new ways, opened up by the integration of our data with the data of other LOD-providers. In which form could or should such expanded possibilities be provided on the publicly accessible website? What stance are we taking on rights management and quality insurance of federated queries/data?

Again, here is the current status: We have mechanisms for the resolution of resource URIs, for content negotiation and for dumping the complete triple store as well as a SPARQL endpoint in place. We are elaborating federated queries responding to specific, concrete research questions. And we are still in the process of evaluating the Linked Data Fragments (Verborgh, 2014) mechanism. We are in contact with several projects the data of which would nicely complement our own (Schmutz, 2008; Sytsma, 2010; Bullón, 2012; Mrozik, 2016). And we are probing possibilities of offering a configurable interface to (federated) querying to our users and of rendering network information visually in our web application. Except most likely for the last point (due to time constraints), the talk will present the state of affairs we will have achieved in summer 2016.

The presentation will thus point out conceptions and their implementations of linking TEI resources into the semantic web, difficulties encountered and needs still left open by scholarly research questions.

Bibliography

Bullón, Xavier Agenjo (2012). Introducción: la Biblioteca Virtual de la Escuela de Salamanca y Linked Open Data. http://dx.doi.org/10.18558/FIL (accessed 8 April 2016).
Ciotti, Fabio et al. (2014). TEI, Ontologies, Linked Open Data: Geolat and Beyond. https://jtei.revues.org/1365 (accessed 8 April 2016).
Eide, Øyvind (2015). Ontologies, Data Modeling, and TEI. https://jtei.revues.org/1191 (accessed 8 April 2016).
Meroño-Peñuela, Albert et al. (2014). Semantic Technologies for Historical Research. A Survey. http://www.semantic-web-journal.net/system/files/swj588_0.pdf (accessed 8 April 2016).
Mrozik, Dagmar (2016). The Jesuit Science Network (Wuppertal/Berlin, being established in 2016). http://jesuitscience.net/ (accessed 8 April 2016).
Mynarz, Jindřich (2013). Capturing temporal dimension of linked data. http://blog.mynarz.net/2013/07/capturing-temporal-dimension-of-linked.html (accessed 8 April 2016).
Pattuelli, Cristina M. (2012). FOAF in the Archive: Linking Networks of Information with Networks of People. Final Report to OCLC. http://www.oclc.org/content/dam/research/grants/reports/2012/pattuelli2012.pdf (accessed 8 April 2016).
Peroni, Silvio (2014). The Semantic Publishing and Referencing Ontologies: 10.1007/978-3-319-04777-5_5.
Ramos, Michele R. (2009). Biography Light Ontology. An Open Vocabulary For Encoding Biographic Texts. http://metadata.berkeley.edu/BiographyLightOntology.pdf (accessed 8 April 2016).
Romanello, Matteo and Pasin, Michele (2013). HuCit. https://bitbucket.org/56k/hucit/ (accessed 8 April 2016).
Ruiz-Iniesta, Almudena and Corcho, Oscar (2014). A review of ontologies for describing scholarly and scientific documents. http://ceur-ws.org/Vol-1155/paper-07.pdf (accessed 8 April 2016).
Sch mutz, Jacob (2008). Scholasticon. Ressources en ligne pour l'étude de la scolastique moderne (1500-1800): auteurs, sources, institutions. http://scholasticon.ish-lyon.cnrs.fr/Presentation/index_fr.php (accessed 8 April 2016).
Sytsma, David (2010). Post-Reformation Digital Library. http://www.prdl.org/ (accessed 8 April 2016).
Verborgh, Ruben et al. (2014). Querying Datasets on the Web with High Availability. In: Proceedings of the 13th International Semantic Web Conference: 10.1007/978-3-319-11964-9_12.
Wettlaufer, Jörg et al. (2015). Semantic Blumenbach: Exploration of Text-Object Relationships with Semantic Web Technology in the History of Science: 10.1093/llc/fqv047.
Schrade, Torsten (2015). xTriples. A generic webservice to extract RDF statements from XML resources. http://xtriples.spatialhumanities.de/index.html (accessed 8 April 2016).