The digital preservation gap(s)

Somaya’s engaging, reflective piece identifies gaps in the wider digital preservation field and provides insightful thoughts as to how the gaps can be narrowed or indeed closed.


I initially commenced this post as a response to the iPres 2016 conference and an undercurrent that caught my attention there – however, really it is a broader comment on the field of digital preservation itself. This post ties into some of my thoughts that have been brewing for several years about various gaps I’ve discovered in the digital preservation field. As part of the Polonsky Digital Preservation Project, I hope we will be able to do some of the groundwork to begin to address a number of these gaps.

So what are these gaps?

To me, there are many. And that’s not to say that there aren’t good people working very hard to address them – there are. (I should note that these people often do this work as part of their day jobs as well as evenings and weekends.)

Specifically, the gaps (at least the important ones I see) are:

  • Silo-ing of different areas of practice and knowledge (developers, archivists etc.)
  • Lack of understanding of working with born-digital materials at the coalface (including managing donor relationships)
  • Traditionally-trained archivists, curators and librarians wanting a ‘magic wand’ to deal with ‘all things digital’
  • Tools to undertake certain processes that do not currently exist (or do not exist for the technological platform or limitation archivists, curators, and librarians are having to work with)
  • Lack of existing knowledge of command line and/or coding skills in order to run the few available tools (skills that often traditionally-trained archivists, curators, and librarians don’t have under their belt)
  • Lack of knowledge of how to approach problem-solving

I’ve sat at the nexus between culture and technology for over two decades and these issues don’t just exist in the field of digital preservation. I’ve worked in festival and event production, radio broadcast and as an audiovisual tech assistant. I find similar issues in these fields too. (For example, the sound tech doesn’t understand the type of music the musician is creating and doesn’t mix it the right way, or the artist asks the technician to do something that isn’t technically possible.) In the digital curation and digital preservation contexts, I’ve effectively been a translator between creators (academics, artists, authors, producers etc.), those working at the coalface of collecting institutions (archivists, curators and librarians) and technologists.

To me, one of the gaps was brought to the fore and exacerbated during the workshop OSS4Pres 2.0: Building Bridges and Filling Gaps, which built on the iPres 2015 workshop “Using Open-Source Tools to Fulfill Digital Preservation Requirements”. Last year I’d contributed my ideas prior to the workshop; however, I couldn’t be there in person. This year I very much wanted to be part of the conversation.

What struck me was that the discussion still began with the notion that digital preservation commences at the point where files are in a stable state, such as in a digital preservation system (or digital asset management system). Appraisal and data transfers weren’t considered at all, yet it is essential to capture metadata (including technical metadata) at this very early point. (Metadata captured at this early point may turn into preservation metadata in the long run.)
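To make “technical metadata at this very early point” a little more concrete, here is a minimal sketch in Python of the kind of information that could be recorded for each file at the moment of transfer. It is only an illustration under my own assumptions: the function names (`describe_file`, `build_manifest`) and the choice of fields (size, last-modified time, SHA-256 checksum) are hypothetical, not taken from any particular tool, and it presumes a machine where Python is available, which is often not the donor’s own computer.

```python
import hashlib
import os
from datetime import datetime, timezone


def describe_file(path):
    """Return basic technical metadata for one file (illustrative fields only)."""
    stat = os.stat(path)
    digest = hashlib.sha256()
    # Read in chunks so large files do not need to fit in memory.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return {
        "path": os.path.abspath(path),
        "size_bytes": stat.st_size,
        # Last-modified time as reported by the file system, expressed in UTC.
        "modified_utc": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
        "sha256": digest.hexdigest(),
    }


def build_manifest(paths):
    """Describe each selected file; the result could be saved alongside the transfer."""
    return [describe_file(path) for path in paths]
```

Calling `build_manifest` with the list of files the archivist has selected returns one record per file. Recording a checksum before the files leave their original location is what later allows someone to confirm that nothing changed in transit.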

I presented a common real-world use case/user story in acquiring born-digital collections: A donor has more than one Mac computer, each running a different operating system. The archivist needs to acquire a small selection of the donor’s files. The archivist cannot install any software onto the donor’s computers or ask the donor to install any software, and only the selected files must be collected – hence, none of the computers can be disk imaged.

The Mac-based tools that exist to do this type of acquisition rely on Java software. Contemporary Mac operating systems don’t come with Java installed by default. Many donors are not competent computer users. They haven’t installed this software as they have no knowledge of it, no need for it, or literally wouldn’t know how to. I put this call out to the Digital Curation Google Groups list several months ago, before I joined the Polonsky Digital Preservation Project. (It followed on from work that my former colleagues at the National Library of Australia and I had undertaken to collect born-digital manuscript archives, having first run into this issue in 2012.) The response to my real-world use case at iPres was:

This final option is definitely not possible in many circumstances, including when collecting political archives from networked environments inside government buildings (another real-world use case I’ve had first-hand experience of). The view was that anything else isn’t possible or is much harder (yes, I’m aware). Nevertheless, this is the reality of acquiring born-digital content, particularly unpublished materials. It demands both ‘hard’ and ‘soft’ skills in equal parts.

The discussion at iPres 2016 brought me back to the times I’ve previously thought about how I could facilitate a way for former colleagues to spend “a day in someone else’s shoes”. It’s something I posed several times when working as a Producer at the Australian Broadcasting Corporation.

Archivists have an incredible sense of how to manage the relationship with a donor who is handing over their life’s work, ensuring the donor entrusts the organisation with the ongoing care of their materials. However, traditionally trained archivists, curators and librarians typically don’t have in-depth technical skillsets. Technologists often haven’t witnessed the process of liaising with donors first-hand. Perhaps those working in developer and technical roles, which typically sit further down the workflow for processing born-digital materials, need opportunities to observe the process of acquiring born-digital collections from donors. Might this give them an increased appreciation for the scenarios that archivists find themselves in (and must problem-solve their way out of)? Conversely, perhaps archivists, curators and librarians need to witness the process of developers creating software (especially the effort needed to create a small GUI-based tool for collecting born-digital materials from various Mac operating systems) or debugging code. Is this just a case of swapping seats for a day or a week? Sharing approaches to problem-solving definitely seems key.

Part of what we’re doing in the Polonsky Digital Preservation Project is starting to talk more holistically: rather than using the term ‘digital preservation’, we’re talking about ‘digital stewardship’. That way, the early steps of acquiring born-digital materials aren’t overlooked. As the Policy and Planning Fellow at Cambridge University Library, I’m aware I can effect change in a different way. Developing policy – including technical policies (for example, the National Library of New Zealand’s Preconditioning Policy, referenced here) – means I can draw on my first-hand experience of acquiring born-digital collections with a greater understanding of what it takes to do this type of work. For now, this is the approach I need to take and I’m looking forward to the changes I’ll be able to influence.


Comments on Somaya’s piece would be most welcome. There are plenty of grounds for discussion, and constructive feedback will only enhance the wider, collaborative approach to addressing the issue of preserving digital content.
