DH 2016 Abstracts

RDA/ADHO Workshop: Evaluating Research Data Infrastructure Components and Engaging in their Development

1. Summary

The purpose of this workshop is to conduct a meaningful examination of the data fabric and infrastructure components being defined by the Research Data Alliance (RDA), to test their relevance and applicability to the needs of the digital humanities community, and to discuss opportunities for humanities engagement in further standards development.

RDA is an international initiative to facilitate the development of effective data practices, standards and infrastructure, both within particular research domains and across domains. It aims to enhance the capacity to archive, preserve, analyze and share data, and to collaborate both within and across research communities. The humanities have an important presence in RDA, and can benefit from the opportunities RDA provides to learn from other research communities working to develop digital infrastructure. RDA also brings together diverse types of technical expertise, organized to put forward best-practice “adoption products.” Some of these products are starting to be taken up in the humanities, such as the Practical Policy Recommendations mentioned below, and there is significant potential for further collaborative work between RDA and digital humanities developers in the future.

Much of the infrastructure needed to support data sharing is of great relevance to Digital Humanities projects, where we too often find ourselves developing and reinventing ad-hoc solutions for data management, draining resources that could be put to better use focusing on the domain-specific nature of our problems and driving new research. It’s easy, especially when time and resources are constrained, to get locked into thinking that our problems are unique and that we need to design custom solutions, but when we examine the problem from other perspectives, the abstractions begin to rise to the surface. To take advantage of the solutions as they are built, however, we must be part of the discussion about the requirements, push for our use cases to be considered in their design, and take part in testing, implementing and sustaining the solutions.

This will be a full-day workshop in the format of a hands-on round-table and open discussion. Participants will be asked to come prepared to discuss details of their particular use cases, as well as solutions and needs relevant to two of the initial outputs of the RDA: the Persistent Identifier (PID) Types (Weigel et al., 2015) and Data Types Registry (DTR) (Lannom et al., 2015). In advance of the workshop, organizers will provide summaries of these outputs and detailed examples of their analysis for use cases in humanities and other relevant domains.

This workshop is a complement to the panel by Dr. Natalie Harrower et al. entitled “Digital data sharing: the opportunities and challenges of opening research”. The panel presents particular challenges in humanities research data management, and aims to generate a discussion around the uniqueness and challenges inherent in humanities research data. This workshop, on the other hand, is a hands-on effort to work with real humanities data use cases, provided by participants, to understand how best to shape RDA outputs to enable better data sharing and management in the humanities.

2. Format of the Workshop

In the first two hours of the workshop, organizers will present a summary of humanities activities in RDA thus far, and describe current calls for participation by RDA working and interest groups.

These calls address an array of topics important in the digital humanities: Institutional Review Boards (IRB), access solutions and metadata standards that support data sharing; the need for institutional repositories for both live and archived digital projects; the need for a “data net” to connect globally distributed repositories, enabling discovery and access; and the need for cultural and organizational changes in the humanities to support research data sharing and open scholarship.

Specific RDA activities covered will include:

The Digital Practices in History and Ethnography Interest Group (DPHE-IG) (https://rd-alliance.org/node/508), chaired by anthropologists Mike Fortun and Kim Fortun at Rensselaer, and Jason Jackson, Director of the Mathers Museum of World Cultures at Indiana University.
The successful adoption by the Platform for Experimental and Collaborative Ethnography (PECE) (http://worldpece.org) of the RDA Practical Policies Recommendations, a specification for best practices for data management.
Two nascent RDA Working Groups: the Research Data Collections WG (https://rd-alliance.org/groups/pid-collections-wg.html) and the WG on Empirical Humanities Metadata.

We will wrap up the first half of the morning session with a group effort to identify the range of solutions and support needed for research data management and sharing in the digital humanities in coming years, and potential opportunities for RDA collaboration. The list generated will be taken back to the RDA community for their consideration and feedback.

The second half of the morning will be devoted to an in-depth presentation of the RDA PID Types and DTR outputs, including a demonstration of their implementation.

After lunch, each participant will be invited to present their use case and requirements, and to engage with workshop organizers and other participants in a discussion of the relevance, gaps, and costs and benefits of adopting the solutions being proposed by RDA. Prior to the workshop, the organizers will issue a short survey asking participants directed questions about their requirements, as well as a template for more detailed descriptions of their use cases. The focus will be on the PID Types and DTR solutions, but other relevant outputs or in-progress efforts may be considered as well. Workshop organizers will take notes and produce a summary report following the workshop to share with the RDA community for their consideration and feedback.

3. Target Audience

Members of the ADHO community who are interested in collaborating with a global multi-disciplinary community to define, develop, test and adopt infrastructure for supporting the management, preservation and sharing of humanities research data. Participants should have some experience with digital humanities projects for which general solutions for working with Persistent Identifiers and machine-actionable Data Types are relevant.

4. Workshop Leaders

Bridget Almas is a Senior Software Developer for the Perseus Digital Library at Tufts University (http://www.perseus.tufts.edu) and co-PI of the Perseids Project (http://www.perseids.org), a collaborative online environment for creating and publishing datasets consisting of transcriptions, translations, linguistic annotations and commentaries of and on ancient source documents. She is a co-chair of the RDA Research Data Provenance Interest Group, and a former member of the RDA Technical Advisory Board.

Kim Fortun is a cultural anthropologist and Professor of Science & Technology Studies at Rensselaer Polytechnic Institute. From 2005 to 2010, Fortun co-edited the Journal of Cultural Anthropology (http://www.culanth.org/), as it was developing its original digital infrastructure. Fortun has played a lead role in the development of the Platform for Experimental and Collaborative Ethnography (PECE), an open source/access online work space for anthropological and historical research. Fortun co-chairs the RDA DPHE Interest Group.

Dr. Natalie Harrower is the Director of the Digital Repository of Ireland (DRI) (http://www.dri.ie/), a publicly accessible online repository for the long-term preservation, sharing and reuse of humanities and social science data. In addition to building an open-source trusted digital repository to international standards, the DRI also influences policy and publishes reports and guidelines on data archiving, metadata standards, preservation infrastructures, Linked Data, digital preservation challenges unique to cultural data, and digital humanities. DRI is piloting a research data management project with diverse digital arts and humanities data, and its current flagship project is the award-winning digital cultural heritage site Inspiring Ireland (inspiring-ireland.ie).

Eveline Wandl-Vogt is a research manager at the Austrian Academy of Sciences, Austrian Centre for Digital Humanities, and VCC1 Co-Chair eInfrastructures, COST ENeL. She serves as an expert in several national and international committees, mainly focusing on standardisation, interoperability, Social Innovation and Open Science. Recently, she has focused on supporting transformation processes from the Humanities towards Interdisciplinary Humanities and applied Humanities in the framework of Open Science, Citizen Science and Open Innovation.

Bibliography

Lannom, L., Broeder, D. and Manepalli, G. (2015). Data Type Registries Working Group Output, https://rd-alliance.org.
Weigel, T., DiLauro, T. and Zastrow, T. (2015). PID Information Types WG Final Deliverable, https://rd-alliance.org.