Cambridge Policy & Planning Fellow, Somaya, writes about her paper and presentation from the Digital Cultural Heritage Conference 2017. The conference paper, Planning for the End from the Start: an Argument for Digital Stewardship, Long-Term Thinking and Alternative Capture Approaches, looks at considering digital preservation at the start of a digital humanities project and provides useful advice for digital humanities researchers to use in their current projects.
In August I presented at the Digital Cultural Heritage 2017 international conference in Berlin (incidentally, my favourite city in the whole world).
I presented the Friday morning Plenary session on Planning for the End from the Start: an Argument for Digital Stewardship, Long-Term Thinking and Alternative Capture Approaches. Otherwise known as: ‘planning for your funeral when you are conceived’. This is a presentation that represents challenges faced by both Oxford and Cambridge and the thinking behind this has been done collaboratively by myself and my Oxford Policy & Planning counterpart, Edith Halvarsson.
We decided it was a good idea to present on this topic to an international digital cultural heritage audience, who are likely to experience challenges similar to those of our own researchers. It is based on some common digital preservation use cases that we are finding in each of our universities.
A Digital Humanities project receives project funding and develops a series of digital materials as part of the research project, and potentially some innovative tools as well. For one reason or another, ongoing funding cannot be secured and so the PIs/project team need to find a new home for the digital outputs of the project.
We have numerous examples of these situations at Cambridge and Oxford. Many projects containing digital content that needs to be ‘rehoused’ are created in the online environment, typically as websites. Some examples include:
- Broadside Ballads Online
- The Casebooks Project
- Cairo Genizah Fragments (Preservation Master files are stored in a legacy DSpace instance)
We believe that thinking holistically right at the start of a project can provide options further down the line, should an unfavourable funding outcome be received.
So it is important to think holistically, specifically by taking a Digital Stewardship approach (incorporating Digital Curation & Digital Preservation).
Models for Preservation
Digital materials don’t necessarily exist in a static form and often they don’t exist in isolation. It’s important to think about digital content as being part of a lifecycle and managed by a variety of different workflows. Digital materials are also subject to many risks so these also need to be considered.
Some models to frame thinking about digital materials:
- Three-Legged Stool for Digital Preservation
- Digital Curation Centre Lifecycle
- CLOCKSS Threats and Mitigations Model and Strategy
- NDSA Levels of Preservation
It is incredibly important to document your project. When handing over responsibility for your digital materials and data, also hand over the documentation: whoever becomes responsible for hosting or preserving your digital project will need to rely on this information. Also ensure that standards, metadata schemas, persistent identifiers etc. are implemented.
This can include providing associated materials, such as:
- the Data Model (example from Design & Art Australia Online)
- Systems Diagrams / System Dependencies Diagrams
- Metadata Standards Crosswalks
- Controlled Lists
- Data Management Plans
Data Management Plans
Some better uses of Data Management Plans (DMPs) could be:
- Submitting DMPs alongside the data
- Writing DMPs as dot-points rather than prose
- Including Technical Specifications such as information about code, software, software versions, hardware and other dependencies
An example of a DMP from Cambridge University’s Dr Laurent Gatto: Data Management Plan for a Biotechnology and Biological Sciences Research Council
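As a rough illustration of the 'Technical Specifications' point above, the sketch below records the Python version, operating system and installed package versions as JSON that could be submitted alongside a DMP. The function name and the package names are hypothetical, and it assumes the project's code is written in Python; other languages would need their own equivalent.

```python
import json
import platform
from importlib import metadata


def technical_specification(packages):
    """Collect basic environment details for inclusion with a DMP.

    `packages` is a list of distribution names the project depends on
    (hypothetical examples are used below).
    """
    spec = {
        "python_version": platform.python_version(),
        "operating_system": platform.platform(),
        "machine": platform.machine(),
        "packages": {},
    }
    for name in packages:
        try:
            spec["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap rather than failing, so the omission is visible.
            spec["packages"][name] = "not installed"
    return spec


if __name__ == "__main__":
    # Replace these names with the project's real dependencies.
    print(json.dumps(technical_specification(["pip", "requests"]), indent=2))
```

Saving this output next to the data gives whoever inherits the project a starting point for rebuilding the environment it ran in.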
Borrowing from Other Disciplines
Rather than having to ‘rebuild the wheel’, we should also consider borrowing from other disciplines. For example, borrowing from the performing arts we might provide similar documents and information such as:
- Technical Rider (a list of requirements for staging a music gig or theatre show)
- Stage Plots (layout of instruments, performers and other equipment on stage)
- Input Lists (ordered list of the different audio channels from your instruments/microphones etc. that you’ll need to send to the mixing desk)
For digital humanities projects and other complex digital works, providing simple and straightforward information about data flows (including inputs and outputs) will greatly assist digital preservationists in determining where something has broken in the future.
Several examples of Technical Riders can be found here:
Here are some approaches to consider for the interim digital preservation of digital materials:
Bundling & Bitstream Preservation
The simplest and most basic approach may be to just zip up files and undertake bitstream preservation. Bitstream preservation only ensures that the zeroes and ones that went into a ‘system’ come out as the same zeroes and ones. Nothing more.
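A minimal sketch of this idea, assuming the project files sit in a single directory: zip the files, record a SHA-256 checksum for each one, and later check that the bytes coming out of the zip still match. The function names are illustrative only, and the zip file should be written outside the directory being bundled.

```python
import hashlib
import zipfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bundle(source_dir, zip_path):
    """Zip every file under source_dir and return {relative_path: sha256}."""
    source_dir = Path(source_dir)
    manifest = {}
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                relative = path.relative_to(source_dir).as_posix()
                archive.write(path, relative)
                manifest[relative] = sha256_of(path)
    return manifest


def verify(zip_path, manifest):
    """Return the names whose bytes in the zip do not match the manifest."""
    mismatches = []
    with zipfile.ZipFile(zip_path) as archive:
        for name, expected in manifest.items():
            try:
                data = archive.read(name)
            except KeyError:
                # The file is missing from the archive altogether.
                mismatches.append(name)
                continue
            if hashlib.sha256(data).hexdigest() != expected:
                mismatches.append(name)
    return mismatches
```

An empty list from `verify` means the same zeroes and ones came back out. It says nothing about whether the files can still be opened or understood, which is exactly the limit of bitstream preservation.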
Exporting / Migrating
Consider exporting digital materials and/or data plus metadata into recognised standards as a means of migrating into another system.
For databases, the SIARD (Software Independent Archiving of Relational Databases) standard may be of use.
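Producing a SIARD package needs dedicated tooling, which is not shown here. As a much simpler stand-in that illustrates the same goal of getting data out of a running database into a software-independent form, the sketch below dumps a SQLite database to plain SQL text. It assumes the project database is SQLite; the function name and paths are hypothetical.

```python
import sqlite3


def dump_to_sql(db_path, out_path):
    """Write the schema and rows of a SQLite database as plain SQL text."""
    connection = sqlite3.connect(db_path)
    try:
        with open(out_path, "w", encoding="utf-8") as out:
            # iterdump() yields the CREATE and INSERT statements one by one.
            for statement in connection.iterdump():
                out.write(statement + "\n")
    finally:
        connection.close()
```

The resulting text file can be read by a person or loaded into another database, but unlike SIARD it carries no archival metadata, so it is best treated as an interim measure.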
Consider hosting code within your own institutional repository or digital preservation system (if your organisation has access to this option) or somewhere like GitHub or other services.
Packing it Down & ‘Putting on Ice’
You may need to consider ‘packing up’ your digital materials in a way that lets you ‘put them on ice’, so that when funding is secured in the future they can be brought back to life fairly simply.
An example of this is the work that Peter Sefton, from the University of Sydney in Australia, has been trialling. Based on Omeka, he has created a version of code called OzMeka. This is an attempt at a standardised way of being able to handle research project digital outputs that have been presented online. One example of this is Dharmae.
Alternatively, King’s Digital Lab provides infrastructure for eResearch and Digital Humanities projects that ensures the foundations of digital projects are stable from the get-go and mitigates risks regarding the longer-term sustainability of digital content created as part of the projects.
This could be done through traditional web archiving approaches, such as using web archiving tools (Heritrix or HTTrack) or downloading video materials using Video DownloadHelper. Alternatively, if you are part of an institution, the Internet Archive’s Archive-It service may be something you want to consider, and you can work with your institution to implement it.
Hosted Infrastructure Arrangements
Another option is to find another organisation to take on the hosting of your service. If you do manage to negotiate this, you will need to put in place either a contract or a Memorandum of Understanding (MOU), as well as hand over the various documentation mentioned earlier.
Video Screen Capture
A simple way of attempting to document a journey through a complex digital work (not necessarily online, this can apply to other complex interactive digital works as well) may be to record a video screen capture.
Alternatively, you can record a journey through an interactive website using Webrecorder, developed by Rhizome, which produces WARC web archive files.
Documenting in Context
Another means of understanding complex digital objects is to document the work in the context in which it was experienced. One example of this is the work of Robert Sakrowski and Constant Dullaart, netart.database.
An example of this is the work of Dutch and Belgian net.artists JODI (Joan Heemskerk & Dirk Paesmans) shown here.
Borrowing from documenting and archiving in the arts, an approach of ‘documenting around the work’ might be suitable – for example, photographing and videoing interactive audiovisual installations.
Web Archives in Context
Another opportunity to understand websites – if they have been captured by the Internet Archive – is viewing these websites using another tool developed by Rhizome, oldweb.today.
An example of the Cambridge University Library website from 1997, shown in a Netscape 3.04 browser.
While there is no one perfect solution and each has its own pros and cons, using an approach that combines different methods might keep your digital materials available beyond the lifespan of your project. These methods will help ensure that digital material is suitably documented, preserved and potentially accessible – so that both you and others can use the data in an ongoing manner.
- How do you want to preserve the data?
- How do you want to provide access to your digital material?
- Developing a strategy including several different methods.
Finally, I think this excerpt is relevant to how we approach digital stewardship and digital preservation:
“No man is an island entire of itself; every man is a piece of the continent, a part of the main” – Meditation XVII, John Donne
We are all in this together. Rather than each of us having to troubleshoot alone and build our own separate solutions, it would be great if we could work to our strengths in collaborative ways, while sharing our knowledge and skills with others.