The project “Repository of Styrian Cultural Heritage” aims to build a digital archive of cultural heritage objects that makes digitized and scholarly annotated collections from various institutions (archives, museums or universities) available to the public. To implement accurate retrieval functionalities and special visualizations, a consistent data pool is indispensable. Furthermore, to guarantee data longevity and the possibility to analyse, re-use or add to the collected data, a well-documented management plan is of great importance. One might assume that the increasing number of similar portals means that comparable documentation already exists in abundance. In fact, most portals either do not have any kind of documentation (e.g. Kulturerbeportal-Niedersachsen, BLO, DG-Kulturerbe) or focus strongly on details such as metadata and mapping guidelines or controlled lists for comparability (e.g. bavarikon, DDB, Europeana, museum-digital, MIMO). They do not offer extensive guidelines, especially for the creation of digital collections from scratch. At the Centre for Information Modelling in Graz, we designed a management plan for all partners involved in this project. It follows the OAIS reference model and takes aspects of research data management into account (e.g. Puhl et al., 2015).
Because we gather data from more than one preservation infrastructure (e.g. the Fedora-based repository GAMS) and deal with a variety of resources, from text-centred materials and images to museum objects and artefacts from various contexts (e.g. criminology or archaeology), the strategy is designed to be generic and expandable. The poster will show systematic strategies for different aspects of the project, ranging from evaluation to content curation and data re-use. The steps include:
Analysis & Evaluation: The starting point is to evaluate and analyse the structure of the diverse sources. This involves a discussion of the goal and scope of the project in general as well as of possible publication scenarios and functionalities for the collected data. At this stage, communication between technically skilled humanists and specialists in the respective collections is crucial, because this collaboration saves time and effort in later stages of the project. On this foundation, workflows and data models that are suited to the needs of the specific institution, including legal aspects, can be established.
Modelling & Metadata Generation: After these evaluations, a data model has to be developed. It includes obligatory object descriptions as a minimum requirement for all object types, along with rules for metadata creation. Furthermore, the depth of the annotation and the use of controlled vocabularies for semantic enrichment have to be defined.
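A minimum requirement of this kind can be checked automatically. The following is a minimal sketch, not the project's actual ruleset: the field names in `REQUIRED_FIELDS` and the values in `OBJECT_TYPES` are hypothetical placeholders for whatever obligatory descriptions and controlled lists the partners agree on.

```python
# Hypothetical minimum field set; the real obligatory fields are defined per project.
REQUIRED_FIELDS = ("identifier", "title", "object_type", "holding_institution")

# Hypothetical controlled list of object types.
OBJECT_TYPES = {"manuscript", "image", "museum_object"}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # Every obligatory field must be present as a non-empty string.
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing required field: {name}")
    # A present object type must come from the controlled list.
    object_type = record.get("object_type")
    if isinstance(object_type, str) and object_type.strip():
        if object_type not in OBJECT_TYPES:
            problems.append(f"object_type not in controlled list: {object_type}")
    return problems
```

Returning a list of problems rather than raising on the first error lets a curator see all shortcomings of a record in one pass.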
Quality Control & Preserving Data: Quality control is an integral part of the workflow and takes place at various stages, for example during data collection or digitization. Existing digital data has to be reviewed and, if necessary, revised according to the defined rulesets. In this respect, the documentation of data provenance (origins of data, revision agreements and transformation scenarios) is an essential step. Gathered and harvested data has to be transformed from proprietary file formats to suitable storage formats. Internationally recognized standards like Dublin Core (DC) for basic descriptive metadata, TEI for manuscripts or LIDO for museum objects are recommended to maximize re-use and interoperability of the data. For the web portal, all objects are going to be mapped to the Europeana Data Model (EDM). This model forms the common ground for the general object description and semantic enrichment.
For long-term preservation, platform-independent systems and open-source software like a Fedora-based repository should be used (e.g. Stigler and Steiner, 2015). All objects need persistent identifiers, which ensure their availability and citability.
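The transformation of harvested records into a standard such as DC can be illustrated with a small mapping step. This is a sketch under stated assumptions: the internal field names in `FIELD_TO_DC` and the wrapper element `record` are hypothetical, and only the Dublin Core element namespace is taken from the standard. It uses Python's standard-library `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical mapping from internal field names to DC elements.
FIELD_TO_DC = {
    "identifier": "identifier",
    "title": "title",
    "creator": "creator",
    "date": "date",
    "object_type": "type",
    "source_file": "source",
}


def to_dublin_core(record: dict) -> ET.Element:
    """Map the known fields of an internal record to DC elements."""
    root = ET.Element("record")  # hypothetical wrapper element
    for field_name, dc_name in FIELD_TO_DC.items():
        value = record.get(field_name)
        if value:  # skip absent or empty fields
            child = ET.SubElement(root, f"{{{DC_NS}}}{dc_name}")
            child.text = str(value)
    return root
```

Calling `ET.tostring(to_dublin_core(record), encoding="unicode")` serializes the result with the `dc:` prefix. Fields that are not listed in the mapping are silently dropped here, so a real workflow would log them as part of the provenance documentation.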
Access & Dissemination: Based on consistent data, various search options (filter, facet or advanced search) can be implemented. In addition to a general representation for all artefacts, elaborate object-specific functionalities and visualizations can be offered, like the possibility to thumb through manuscripts or to view postal routes of correspondence on a historic map.
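Why consistent metadata matters for faceted search can be shown with a few lines of code. The sketch below assumes records are plain dictionaries with hypothetical fields such as `object_type` and `institution`; a production portal would delegate this to a search index rather than filter in memory.

```python
from collections import Counter


def facet_counts(records, field_name):
    """Count how many records carry each value of the given field."""
    counts = Counter()
    for record in records:
        value = record.get(field_name)
        if value is not None:
            counts[value] += 1
    return dict(counts)


def filter_records(records, **selected):
    """Keep only records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in selected.items())
    ]
```

If the same object type is spelled differently across institutions, the counts split into separate facets, which is one reason for agreeing on controlled vocabularies at the modelling stage.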
Adding & Re-use: The possible re-use of data and the addition of new objects or scientific findings as well as the further enrichment of metadata also have to be considered. Adding information according to the defined guidelines guarantees the same quality for re-use, whether for follow-up research, teaching or browsing.
The poster will introduce our approach to a common web portal for a variety of digital resources. It will present a best-practice model for an interdisciplinary, cross-institutional cultural heritage project, based on experiences with data preservation and with consistent metadata description and enrichment. In this respect, it also provides guidance for implementing a long-lasting, expandable digital archive.