Sustaining the results of digital humanities research projects remains an ongoing challenge within the wider DH ecosystem. This is particularly the case for research infrastructure developments, where the scale, complexity and overarching aim of the project (to serve a potentially still emerging research community) make continued accessibility both more important and more difficult.
Like any enduring challenge, sustainability of complex digital resources has been subjected to a certain amount of scholarly investigation (though less overall than one might expect). What a survey of this literature shows, however, is that perspectives on how to face the challenge of sustainability are highly dependent on how a project or infrastructure views itself. When it is viewed primarily as a form of organisation or institution, the sustainability model proposed will likely focus on the necessary ‘business model’ for maintaining the services created (Maron et al., 2009). The Archives Portal Europe (APE), for example, established itself as a Foundation after the end of the project, in order to maintain the vital functions of the infrastructure and to further connect with other projects and similar initiatives.
Alternatively, when project outputs are viewed as a tool or technical platform, a sustainability proposal will primarily take into consideration practices such as the migration and curation of the repositories where the data are stored, and the continued maintenance of work environments and specific tools. The TextGrid project, for example, has been hugely successful in rolling its activities forward over a long period, continuing to make its services available to users. At its best, this approach results in a broad focus on software durability, documentation of processes and the modularity of services (Buddenbohm et al., 2015).
But to understand sustainability thoroughly we must also engage a second huge and unresolved issue in digital humanities, namely the reuse of project outputs and data. In particular, the “Log Analysis of Digital Resources in the Arts and Humanities”, or LAIRAH, project (Warwick et al., 2008) has contributed significantly to our understanding of what factors enable digital projects and tools to be found and adopted by users. From the results of this project another model for the sustainable project emerges, in which the communication and branding of the project is a key element of its success.
This presentation brings forward the hypothesis that a successful approach to sustainability for Research Infrastructures needs to be comprehensive: one that considers not just data, technology, community, communications or processes in isolation, but all of them simultaneously. In addition, it should focus not only on a project as a collection of tangible and intangible assets, but also on the potential user base for these assets, and on what these users consider valuable about them.
Discussion of this user-centred approach to sustainability will be based on the experiences of the Collaborative Digital Archival Research Infrastructure (CENDARI) project’s year-long sustainability planning exercise, conducted from January 2014 to January 2015. This exercise, which built upon previous work in the project and a strong link to the Digital Research Infrastructure for the Arts and Humanities (DARIAH ERIC), resulted in a set of principles and processes for mapping and sustaining user value from the project for the medium and long terms. Both the generic process (which will be released as a sustainability toolkit at the end of the project) and the specific actions implemented by the project match, on some level, the specifics of the CENDARI development. They also reflect the reality, identified by Joris Van Zundert, of the “fluidity” of research infrastructure, caught up in both the digital information lifecycle and the creation of knowledge by end users, as well as the software components (Van Zundert, 2012).
The CENDARI sustainability planning process comprised four stages, from pre-planning to closure and post-project actions, each of which contributed to the overall, holistic sustainability strategy. This cycle was intended to counteract a natural impetus within projects to view sustainability as a concern only for the final phase of the project, rather than one to be integrated into the project’s development and even its conception.
As a key component of the second and third phases of the CENDARI project sustainability planning process, the project carried out a thorough audit (including a stakeholder validation meeting) to refine its understanding of what assets the project had generated and how they could be maintained, shared and indeed passed on to its key users for further development. This process identified seven categories of assets as most likely to find future usage, each of which posed unique challenges in how it could be captured, made visible and sustained. One of the greatest challenges of the CENDARI sustainability planning process has been to find a solution for each of these areas, as we would for our personal work data, that makes the assets findable and reusable in a contextualised manner and preserves them in ‘multiple formats and multiple locations.’ For each asset type, the audience for potential future use is different, and therefore the solution proposed is as well.
The CENDARI portal is the most visible of its assets, representing the final synthesis of the project’s activities and its main point of access. For many projects, this would be where sustainability planning would not only begin but also end. CENDARI approached this sustainability challenge via a three-pronged strategy, guaranteeing three years of access through the German arm of DARIAH but also ensuring new communities and new approaches would be recruited to continue development.
The portal is useful not only in its complete final form, however, but also as a collection of unique services, tools and components optimised to support DH research. This possible reuse of the project outcomes was foreseen from the beginning, and a highly modular, service-oriented architecture was adopted for the project. The tools therefore require a sustainable pathway outside of the portal. That said, connecting tools with potential user bases is a constant challenge. The software community practice of using GitHub to share software was adopted, but further awareness-raising was also required to ensure maximal future use of the tools.
CENDARI holds a large amount of data from different sources, some unique to the project, others well signposted elsewhere, and with different requirements and expectations for sustainability. This has been its legacy as a project seeking to reuse archival data for historical research, where the culture and ability to share data is unevenly developed. The CENDARI data portal gives access to these data, and the project’s data agreement and license have been developed with DARIAH as a co-signatory, so in many ways DARIAH had already agreed from an early point in the project to sustain them. But DARIAH is not well known as a data provider or source, and this solution alone may not maximise visibility and reuse. Therefore a redeposit protocol for unique data with an external trusted source has also been facilitated.
The Archival Research Guides exist as a particular subset of the data unique to CENDARI, but their status as both primary and secondary research sources justifies their consideration as an asset class in themselves, in particular because of the manner in which they challenge existing norms of publication, communication and evaluation in the discipline of history. As extended and enhanced publications, incorporating analysis, links to data sources, multimedia objects, and links to project ontologies, these guides need to be delivered within the project portal. But to sustain these unique works of scholarship only in that format would again potentially limit their visibility. They will therefore be offered in one or more export formats, as well as becoming the focus of both a review publication and a research paper to be submitted to a mainstream (not digital) historical journal. In this way, their contributions to scholarship can be recognised as independent from the format in which they have been delivered.
Given how particular many of the project experiences in building for the DH community had been, a specific audit of CENDARI’s tacit knowledge was also undertaken, and several white papers and process-oriented toolkits have emerged from the project as a result (including, for example, a ‘White Book of Archives’ documenting the project’s experience of federating highly heterogeneous data from traditional collection-holding institutions). As a related issue, some of the project’s management assets may also have a future utility for others.
Perhaps the least easily defined and sustained aspects of the CENDARI project will be the communities (mixed and homogeneous groups of historians and other humanistic scholars, collections experts and technologists) that it has brought together and formed. Interconnectivity between cognate projects will be a key resource for this, as some communities will have interests across these projects, and networks can and should be shared. But some of the community aspects are unique to CENDARI, and will have a specific role in guiding the future use of the portal and its components: for this reason, the project will use another DARIAH mechanism, the working group, to provide a structure for continued development of project concerns and assets.
As can be seen from this description, CENDARI has made both its own sustainability and the potential future role of the DARIAH ERIC in the sustainability of medium- to large-scale digital projects in Europe into key areas of applied research and development. The resulting sustainability toolkit should assist future projects in extending both their sustainability planning and their strategies.