Digital preservation is a mature concept, but we need to pitch it better

Cambridge Technical Fellow, Dave, presents his thoughts on the OAIS and his own elevator pitch about digital preservation from the Pericles/DPC Acting on Change conference in London, last week.


Some of the best discussions at the Pericles / DPC Acting on Change conference came during the morning panel sessions. In the first, provocatively titled “Beyond the OAIS”, Barbara Sierman, from The KB National Library of the Netherlands, admitted that the OAIS can be confusing for newcomers… and as a newcomer to digital preservation, I agree!

Fellow panellist Barbara Reed, from Recordkeeping Innovation, suggested the OAIS’s Administration function as a potentially-confusing area, and this too struck a chord. I’ve gained some systems analysis and modelling experience over the years, and my first thought looking at the OAIS was that the Admin function looked like a place where much of the hard-to-model, human stuff had been separated from the technical, tool-based parts. (I’ve seen this happen before in other domains…)

There’s actually a hint that this is happening in the standard’s diagram for the Admin function – it’s busier and more information-packed than the other function diagrams, which tends to be a sign that it’s a bit of a ‘bucket’ which needs more modelling. This led me to an immediate concern that Admin doesn’t sit easily within the overall standard, and I think Barbara Reed had picked up on this too, suggesting that two more focused documents – one ‘technical’, one ‘human’ – might make the standard easier to use.

Then Artefactual Systems’ Dan Gillean asked who we should be talking to about the OAIS outside of the community? Barbara Reed answered ‘Enterprise Architects’; and two of the things Enterprise Architects use in their work are domain models and pattern languages. I was glad Barbara made this point, because I had already come to a similar conclusion.

AV Preserve’s Kara Van Malssen replied ‘communications experts’ to Dan’s question, suggesting Marketing in particular, though perhaps skilled science communicators might be even better? (Both Cambridge and Oxford – among others – put a lot of effort into public engagement with research, and there is a healthy body of research literature about it).

And the importance of communication was further emphasised by Nancy McGovern (MIT Libraries) and Neil Beagrie (Charles Beagrie Ltd) during the second day’s panel session (Preparing for Change). Nancy used the phrase ‘Technical Author’ at one stage – and it occurred that such input might be a very quick win for the OAIS Reference Implementation? Meanwhile, Neil talked about needing a short, pithy statement that explains what we do to funders…

So here’s an attempt at an Elevator Pitch:

Digital Preservation means sourcing computer-based material that is worthy of preservation, getting that material under control, and then maintaining the usefulness of that material, forever.

This Elevator Pitch is part of the pattern language I’m working on with my fellow Polonsky Fellows, and (I hope, soon) the broader Digital Preservation community. (We’re still thinking about that last ‘forever’, but considering how old some of the things in our libraries are, ‘forever’ seems an easy way of thinking about it).

The key point that Nancy McGovern made, however, was that we’re ready to take Digital Preservation to a wider audience. I think she’s right. The OAIS is confusing – it’s a real head-scrambler for a newcomer like me – but it has reached a level of maturity: it’s clear how much deep thought and expertise underpins it. And, of course, the same goes for the technology it has influenced over the previous decades. This supports what Arkivum’s Matthew Addis said in the second day’s keynote – the digital preservation community is ready to take their ideas to the world: we perhaps just need to pitch them a little better?

A digital preservation pattern language

Technical Fellow, Dave, shares his final update from PASIG NYC in October. It includes his opinions on digital preservation terminology and his development of an interpretation model for mapping processes.


Another of the sessions at the PASIG NYC conference we attended concerned standardisation. It started with Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of Practice by Artefactual Systems’ Dan Gillean, which explained the relationships between De Jure / De Facto, and Open / Proprietary standards, and which introduced the major Digital Preservation standards. Then later in the session, Sibyl Schaefer (@archivelle) from the UCSD Chronopolis Network presented Here we go again down this road: Certification and Recertification, which covered the ISO standardisation terminology (e.g. Certification vs Accreditation) and went deeper into the formal (De Jure) standards, in particular the Open Archival Information System (OAIS) reference model (ISO 14721) and the Audit and Certification of Trustworthy Digital Repositories (ISO 16363).

One aspect of Dan Gillean’s presentation that resonated with me was his discussion of the Communities of Practice that had emerged around the Digital Preservation standards. This reminded me of a software development concept called design patterns, which has its roots in (real) architecture, and in particular a book called A Pattern Language: towns, buildings, construction, by Christopher Alexander (et al). This proposes that planners and architects develop a ‘language’ of architecture so that they can learn from each other and contribute their ideas to a more harmonious, better-planned whole of well-designed cities, towns and countryside. The key concept they propose is that of the ‘pattern’:

The elements of this [architectural] language are entities called patterns. Each pattern describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice (Alexander et al, 1977:x).

Each pattern has a common structure, including details of the problem it solves, the forces at work, the start and end states of related resources, and relationships to other patterns. (James Coplein has provided a short overview of a typical pattern structure). The idea is to build up a playbook of (de facto) standard approaches to common problems, and the types of behaviour that might solve them, as a way of sharing and reusing knowledge.

I asked around at PASIG to see if anyone had created a reusable set of Digital Preservation Patterns (somebody please tell me if so, it’ll save me heaps of work!), but I drew a blank. So I grabbed the Alexander book (I work in a building containing 18 million books!), and also had a quick look online. The best online resource I found was http://www.hillside.net/ – which contained lots of familiar names related to programming design patterns (e.g. Erich Gamma, Grady Booch, Martin Fowler, Ward Cunningham). But the original Alexander book also gave me an insight into patterns that I’d never heard of before, in particular the very straightforward way that its patterns related to each other from the general / high level (e.g. patterns about regional, city and town planning), via mid-level patterns (for neighbourhoods, streets and building design), to the extremely detailed (e.g. patterns for where to put beds, baths and kitchen equipment).

This helped me consider what I think are two issues with Digital Preservation: firstly, there’s a lot of jargon (e.g. ‘fixity’, ‘technical metadata’ or ‘file format migration’ – none of which are terms fit for normal conversation). Secondly, many of the Digital Preservation models mismatch concepts at different levels of abstraction and complexity: for example the OAIS places a discrete process labelled Data Management alongside another labelled Ingest, where Ingest is quite a specific, discrete step in the overall picture, but where there’s also a strong case for saying that the whole of Digital Preservation is ‘data management’, including Ingest itself.

Such issues of defining and labelling concepts are common in most computer-technology-related domains, of course, and they’re often harmful (contributing to the common story of failed IT projects and angry developers / customers etc). But the way in which A Pattern Language arranges its patterns at the same levels of abstraction and detail, and in doing so enables drilling-down through region / city / town / neighbourhood / street / building / room, provides an elegant example of how to avoid this trap.

Hence I’ve been working on a model of the Digital Preservation domain that has ‘elevator pitch’ and ‘plain English’ levels of detail before I get to the nitty-gritty of technical details. My intention is to group similarly-sized and equally-complex sets of Digital Preservation processes together in ways that help describe them in clear, jargon-free ways, hence forming a reusable set of patterns that help people work out how to implement Digital Preservation in their own organisational contexts. I will have an opportunity to share this model, and the patterns I derive from it, as it develops. Watch this space.

Alexander, C., Ishikawa, S., Silverstein, M., Jacobson, M., Fiksdahl-King, I. and Angel, S. (1977) A Pattern Language: towns, buildings, construction. 1st edn. New York: Oxford University Press.


Do you know of any work that’s been done to create a Digital Preservation Pattern Language? Would you like to contribute your ideas towards Dave’s idea of creating a playbook of Digital Preservation design patterns? Please let Dave know using the form below…