A portable digital preservation roadshow kit

As a part of the lead up to Digital Preservation Day, the Cambridge team held a series of roadshows with a pop-up exhibition to raise awareness of digital preservation within the wider University. They wanted to let people know that there was a team concentrating on this area. They also wanted to find out people’s concerns regarding the long-term continuity of the digital content that they create and use. Outreach and Training Fellow, Lee, writes about what is in the pop-up kit and how it can be used at your institution to generate awareness of digital preservation.


The exhibition kit

In the lead up to the exhibition we created a portable carry kit so that we could repeat the exhibition in various locations day after day.

To stimulate discussion as well as having an interactive experience, the first portable exhibition consisted of:

  • An A1 poster, printed on cloth for ease of carrying and to reduce wear and tear. Images attributed as correctly as possible and in line with open and creative commons requirements.

Prototype exhibition poster.

  • A roll-up display banner with an image sourced from the Cambridge Digital Library (appropriately from the Book of Apocalypse), plus a bit of Photoshop work to make a corrupted version. I like to describe the image as the digital equivalent of mould affecting a precious manuscript. You can still see the image but it’s not quite right, and so work needs to be done to put it ‘right’.
  • A laptop with the URLs to various playable games on the Internet Archive, to make the point about emulation and how digital is different from traditional media. The games we used were:
  • A small collection of tangible technology from the past to the present. This was sourced from the Fellows’ collections of materials and included:
    • 8” floppy disk
    • 25” floppy disk
    • 5.25” floppy disk
    • 5.25” floppy disk drive
    • Compact Disc Recordable (CD-R)
    • Commercial double sided film on Digital Versatile Disk (DVD)
    • Digital Versatile Disk ReWritable (DVD-RW)
    • A Hard Disk Drive 250GB from a laptop
    • 2GB and 1GB Random Access Memory (RAM) chips
    • USB stick with the hard cases removed to show the small PCB and memory chip
    • An SD card enclosure
    • A 2GB micro SD card
    • A micro SD card USB enclosure
    • An iPod c. 2012
    • An acetate, c. 1990, with degradation (courtesy of JISC’s Dom Fripp) to make a visual point through an analogue item about the degradation and the fragile nature of materials we are working with.

A close up of the tech on display.

As a part of future work we’d like to develop this into a more generic display kit for those who do not have the time to create such materials, but have an opportunity to run displays. Once set up, this is how the display looked in the University Library’s Entrance Hall.

Roadshow display set up in the Entrance Hall of the Cambridge University Library.

We also relied on the generosity of the hosting venues, who gave us the space to come and visit. It was important that we toured around the site to spread the message amongst the Cambridge University community, so we visited the following venues:

  • Alison Richard Building – 16th November
  • Gordon and Betty Moore Library – 17th November
  • Department of Engineering Library – 20th November
  • University Library Entrance Hall – 21st November
  • Churchill College – 22nd November
  • Faculty of English Social Space – 23rd November

The following is a summary of some of the views captured from the Post-It notes. As it’s not part of a proper study, we removed the views that repeated each other. The most popular answer for the “what digital materials should be saved” question was ‘all’ or ‘everything’. Most thought that the Library should be responsible for the preservation of all materials and the most common challenges were money, time, and reacting to change.

Summary of Post-It note capture.

A lot of work was put into the creation of the pop-up exhibition, and it was developed carefully so that it could be used beyond the life of the DPOC project. We have created a resource that can be used at a moment’s notice to begin the digital preservation conversation with a wider audience. We’d like to develop this kit a bit further so it can be personalised for your own outreach efforts.


If you would like to collaborate on this kit, please get in touch in the comments below or via the ‘contact us’ page.

PASIG 2017: honest reflections from a trainee digital archivist

A guest blog post by Kelly, one of the Bodleian Libraries’ graduate digital archivist trainees, on what she learned as a volunteer and attendee of PASIG 2017 Oxford.


Amongst the digital preservation professionals from almost every continent and 130 institutions, my five traineeship colleagues and I were to be found amongst the lecture theatre seats, annexe demos and the awesome artefacts at the Museum of Natural History for PASIG 2017, Oxford. At just 6 months into our traineeship, it was a brilliant opportunity not only to apply some of our new knowledge to work at Special Collections, Bodleian Libraries, but also to gain a really current and relevant insight into theories we have been studying as part of our long distance MSc in Digital Curation at Aberystwyth University. The first ‘Bootcamp’ day was exactly what I needed to throw myself in, and it really consolidated my confidence in my understanding of some aspects of the shared language that is used amongst the profession (fixity checks, maturity models…as well as getting to grips with submission information packages, dissemination information packages and everything that occurs in between!).

My pen didn’t stop scribbling all three days, except maybe for tea breaks. Saying that, the demo presentations were also a great time for me and the other trainees to ask questions specifically about workflows and benefits of certain software such as LibNova, Preservica and ResourceSpace.

For want of a better word (and because it really is the truth) PASIG 2017 was genuinely inspiring and there were messages delivered so powerfully I hope that I stay grounded in these for my entire career. Here is what I was taught:

The Community is invaluable. Many of the speakers were quick to assert that sharing practice amongst the digital preservation community is key. This is a value I was familiar with, yet it was striking to witness it happening throughout the conference in such a sincere manner. I can assure you the gratitude and affirmation that followed Eduardo del Valle, University of the Balearic Islands, and his presentation “Sharing my loss to protect your data: A story of unexpected data loss and how to do real preservation” was as encouraging to witness for someone new to the profession as it was for all of the experienced delegates present. As well as sharing practice, it was clear that the community needs to be advocating on behalf of each other. It is time and resource consuming but oh-so important.

Digital archives are preserving historical truths. Yes, the majority of the workflow is technological but the objectives and functions are so much more than technology; to just reduce digital preservation down to this is an oversimplification. It was so clear that the range of use cases presented at PASIG were all driven towards documenting social, political, historical information (and preserving that documentation) that will be of absolute necessity for society and infrastructure in future. Right now, for example, Angeline Takewara and her colleagues at UN MICT are working on a digital preservation programme to ensure absolute accountability and usability of the records of the International Criminal Tribunals of both Rwanda and Yugoslavia. I have written a more specific post on Angeline’s presentation here.

Due to the nature of technology and the digital world, the goalposts will always be moving. For example, the challenges raised in Somaya Langley’s talk on the future of digital preservation, such as the mysteries of extracting data from smart devices, will soon become (and maybe already are) a reality for those working with accessions of archives or information management. We should, then, embrace change and embrace the unsure and ultimately ‘get over the need for tidiness’, as pointed out by John Sheridan from The National Archives during his presentation “Creating and sustaining a disruptive digital archive”. This is usually counter-intuitive, but as the saying goes, one of the most dangerous phrases to use is ‘we’ve always done it that way’.

The value of digital material outlives the software, so enabling the prolonged use of software is a real and current issue. Admittedly, this was a factor I had genuinely not even considered before. In my brain I linked obsolescence with hardware and hardware only. Dr. Natasa Milic-Frayling’s presentation on “Aging of Digital: Managed Services for digital continuity” therefore shed much light on the changing computing ecosystem and the gradual aging of software. What I found especially interesting about the proposed software-continuity plan was its transparency: the client can ask to see the software at any time whilst it is being stabilised and maintained.

Thank you so much PASIG 2017 and everybody involved!

One last thing…in closing, Cliff Lynch, CNI, brought up that there was comparatively less Web Archiving content this year. If anybody fancies taking a trainee to Mexico next year to do a (lightning) talk on Bodleian Libraries’ Web Archive I am keen…


Visit to the Parliamentary Archives: Training and business cases

Edith Halvarsson, Policy and Planning Fellow at Bodleian Libraries, writes about the DPOC project’s recent visit to the Parliamentary Archives.


This week the DPOC fellows visited the Parliamentary Archives in London. Thank you very much to Catherine Hardman (Head of Preservation and Access), Chris Fryer (Digital Archivist) and Grace Bell (Digital Preservation Trainee) for having us. Shamefully I have to admit that we have been very slow to make this trip; Chris first invited us to visit all the way back in September last year! However, our tardiness in making our way to Westminster was in the end aptly timed with the completion of year one of the DPOC project and planning for year two.

Like CUL and Bodleian Libraries, the Parliamentary Archives also first began their own Digital Preservation Project back in 2010. Their project has since transitioned into digital preservation in a more programmatic capacity as of 2015. As CUL and Bodleian Libraries will begin drafting business cases for moving from project to programme in year 2, meeting with Chris and Catherine was a good opportunity to talk about how you start making that tricky transition.

Of course, every institution has its own drivers and risks which influence business cases for digital preservation, but there are certain things which will sound familiar to a lot of organisations. For example, what Parliamentary Archives have found over the past seven years is that advocacy for digital collections and training staff in digital preservation skills are ongoing activities. Implementing solutions is one thing, whereas maintaining them is another. This, in addition to staff who have received digital preservation training eventually moving on to new institutions, means that you constantly need to stay on top of advocacy and training. Making “the business case” is therefore not a one-off task.

Another central challenge in building business cases is how you frame digital preservation as a service rather than as “an added burden”. The idea of “seamless preservation” with no human intervention is a very appealing one to already burdened staff, but in reality workflows need to be supervised and maintained. To sell digital preservation, that extra work must therefore be perceived as something which adds value to collection material and the organisation. It is clear that physical preservation adds value to collections, but the argument for digital preservation can be a harder sell.

Catherine had, however, some encouraging comments on how we can attempt to turn advice about digital preservation into something which is perceived as adding value. Being involved with and talking to staff early on in the design of new project proposals, rather than as an add-on after processes are already in place, is an example of this.

Image by James Mooney

All in all, it has been a valuable and encouraging visit to the Parliamentary Archives. The DPOC fellows look forward to keeping in touch – particularly to hear more about the great work the Parliamentary Archives have been doing to provide digital preservation training to staff!

What is holding us back from change?

There are worse spots for a meeting. Oxford. Photo by: S. Mason

Every 3 months the DPOC team gets together in person in either Oxford, Cambridge or London (there’s also been talk of taking a meeting at Bletchley Park sometime). As this is a collaborative effort, these meetings offer a rare opportunity to work face-to-face instead of via Skype with the endless issues around screen sharing and poor connections. Good ideas come when we get to sit down together.

As our next joint board meeting is next week, it was important to look over the work of the past year and make sure we are happy with the plan for year two. Most importantly, we wanted to discuss the messages we need to give our institutions as we look towards the sustainability of our digital preservation activities. How do we ensure that the earlier work and the work being done by us does not get repeated in 2-5 years’ time?

Silos in institutions

This is especially complicated when dealing with institutions like Oxford and Cambridge. We are big and old institutions with teams often working in silos. What does siloing have an effect on? Well, everything. Communication, effort, research—it all suffers. Work done previously is done again. Over and over.

The same problems are being tackled within different silos; this is duplicated and wasted effort if they are not communicating their work to each other. This means that digital preservation efforts can be fractured and imbalanced if institutional collaboration is ignored. We have an opportunity and responsibility in this project to get people together and to get them to talk openly about the digital preservation problems they are each trying to tackle.

Managers need to lead the culture change in the institution

While not always the case, it is important that managers do not just sit back and say “you will never get this to work” or “it has always been this way.” We need them on our side; they are often the gatekeepers of silos. We have to bring them together in order to start opening the silos.

It is within their power to be the agents of change; we have to empower them to believe in changing the habits of our institution. They have to believe that digital preservation is worth it if their teams are to believe it too.

This might be the ‘carrot and stick’ approach or the ‘carrot’ only, but whatever approach is used, there are a number of points we agreed needed to be made clear:

  • our digital collections are significant and we have made assurances about their preservation and long term access
  • our institutional reputation plays a role in the preservation of our digital assets
  • digital preservation is a moving target and we must be moving with it
  • digital preservation will not be “solved” through this project, but we can make a start; it is important that this is not then the end.

Roadmap to sustainable digital preservation

Backing up any messages is the need for a sustainable roadmap. If you want change to succeed and if you want digital preservation to be a core activity, then steps must be actionable and incremental. Find out where you are, where you want to go and then outline the timeline of steps it will take to get there. Consider using maturity models to set goals for your roadmap, such as Kenney and McGovern’s, Brown’s or the NDSA model. Each is slightly different and some might be more suitable for your institution than others, so have a look at all of them.

It’s like climbing a mountain. I don’t look at the peak as I walk; it’s too far away and too unattainable. Instead, I look at my feet and the nearest landmark. Every landmark I pass is a milestone and I turn my attention to the next one. Sometimes I glance up at the peak, still in the distance—over time it starts to grow closer. And eventually, my landmark is the peak.

It’s only when I get to the top that I see all of the other mountains I also have to climb. And so I find my landmarks and continue on. I consider digital preservation to be much the same.

What are your suggestions for breaking down the silos and getting fractured teams to work together? 

(Mis)Adventures in guest blogging

Sarah shares her recent DPC guest blogging experience. The post is available to read at: http://www.dpconline.org/blog/beware-of-the-leopard-oxford-s-adventures-in-the-bottom-drawer 


As members of the Digital Preservation Coalition (DPC), we have the opportunity to contribute to their blog on issues in digital preservation. As the Outreach & Training Fellow at Oxford, that task falls upon me when it’s our turn to contribute.

You would think that because I contribute to this blog regularly, I’d be an old hand at blogging. It turns out that writer’s block can hit at precisely the worst possible time. But I forced out what I could and then turned to the other Fellows at Oxford for support. Edith and James both added their own work to the post.

With a final draft ready, the day approached when we could submit it to the blog. Even the technically-minded struggle with technology now and again. First, it was the challenge of uploading images: it only took about 2 or 3 tries, and then I deleted the evidence of my mistakes. Finally, I clicked ‘submit’ and waited for confirmation.

And I waited…

And got sent back to the homepage. Then I got a ‘failure notice’ email that said “I’m afraid I wasn’t able to deliver your message to the following addresses. This is a permanent error; I’ve given up. Sorry it didn’t work out.” What just happened? Did it work or not?

So I tried again….

And again…

And again. I think I submitted 6 more times before I emailed the DPC to ask what I had done wrong. I had done NOTHING wrong, except press ‘submit’ too much. There were as many copies waiting for approval as there were times when I had hit ‘submit’. There was no way to delete the evidence, so I couldn’t avoid that embarrassment.

Minus those technological snafus, everything worked and the DPOC team’s first guest blog post is live! You can read the post here for an Oxford DPOC project update.

Now that I’ve got my technological mistakes out of the way, I think I’m ready to continue contributing to the wider digital preservation community through guest blogging. We are a growing (but still relatively small) community and sharing our knowledge, ideas and experiences freely through blogs is important. We rely on each other to navigate a field where things can be complex and ever-changing. Journals and project websites date quickly, but community-driven and non-profit blogs remain a good source of relevant and immediate information. They are a valuable part of my digital preservation work and I am happy to be giving back.

Outreach and Training Fellows visit CoSector, University of London

Outreach & Training Fellow, Lee, chronicles his visit with Sarah to meet CoSector’s Steph Taylor and Ed Pinsent.


On Wednesday 29 March, a date forever to be associated with the UK triggering of Article 50, Sarah and Lee met with CoSector’s Stephanie Taylor and Ed Pinsent in the spirit of co-operation. For those that don’t know, Steph and Ed are behind the award-winning Digital Preservation Training Programme.

Russell Square was overcast but it was great to see that London was still business as usual with its hallmark traffic congestion and bus loads of sightseers lapping up the cultural hotspots. Revisiting the University of London’s Senate House is always a visual pleasure and it’s easy to see why it was home to the Ministry of Information: the building screams order and neat filing.

Senate House, University of London. Image credit: By stevecadman – http://www.flickr.com/photos/stevecadman/56350347/, CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=6400009

We were keen to speak to Steph and Ed to tell them more about the DPOC Project to date and where we were at with training developments. Similarly, we were also keen to learn about the latest developments from CoSector’s training plans and we were interested to hear that CoSector will be developing their courses into more specialist areas of digital preservation, so watch this space… (well at least, the CoSector space).

It was a useful meeting because it gave us the opportunity to get instant feedback on the way the project is working and where we could help to feed into current training and development needs. In particular, they were really interested to learn about the relationship between the project team and IT. Sarah and I feel that it is easier to understand IT issues because we have access to two technical IT experts who are on board and happy to answer our questions, however simple they may be from an IT point of view. Similarly, we find that we have better conversations with our colleagues who are Developers and Operations IT specialists because we have a linguistic IT bridge with our technical colleagues.

It was a good learning opportunity and we hope to build upon this first meeting in the future as part of a sustainable training solution.

Visit to the National Archives: herons and brutalism

An update from Edith Halvarsson about the DPOC team’s trip to visit the National Archives last week. Prepare yourself for a discussion about digital preservation, PRONOM, dark archives, and wildlife!


Last Thursday DPOC visited the National Archives in London. David Clipsham kindly put much time into organising a day of presentations with the TNA’s developers, digitization experts and digital archivists. Thank you Diana, David & David, Ron, Ian & Ian, Anna and Alex for all your time and interesting thoughts!

After some confusion, we finally arrived at the picturesque Kew Gardens station. The area around Kew is very sleepy, and our first thought on arrival was “is this really the right place?” However, after a bit more circling around Kew, you definitely cannot miss it. The TNA is located in an imposing brutalist building, surrounded by beautiful nature and ponds built as flood protection for the nation’s collections. They even have a tame heron!

After we all made it on site, the day kicked off with an introduction from Diana Newton (Head of Digital Preservation). Diana told us enthusiastically about the history of the TNA and its Digital Records Infrastructure. It was really interesting to hear how much has changed in just six years since DRI was launched – both in terms of file format proliferation and an increase in FOI requests.

We then had a look at TNA’s ingest workflows into Preservica and storage model with Ian Hoyle (Senior Developer) and David Underdown (Senior Digital Archivist). It was particularly interesting to hear about the TNA’s decision to store all master file content on offline tape, in order to bring down the archive’s carbon footprint.

After lunch with Ron Davies (Senior Project Manager), Anna de Sousa and Ian Henderson spoke to us about their work digitizing audiovisual material and 2D images. Much of our discussion focused on standards and formats (particularly around A/V). Alex Green and David Clipsham then finished off the day talking about born-digital archive accession streams and PRONOM/DROID developments. This was the first time we had seen the clever way a file format identifier is created – there is much detective work required on David’s side. David also encouraged us and anyone else who relies on DROID to have a go and submit something to PRONOM – he even promised it’s fun! Why not read Jenny Mitcham’s and Andrea Byrne’s articles for some inspiration?

Thanks for a fantastic visit and some brilliant discussions on how digital preservation work and digital collecting are done at the TNA!

IDCC 2017 – data champions among us

Outreach and Training Fellow, Sarah, provides some insight into some of the themes from the recent IDCC conference in Edinburgh on 21 – 22 February. The DPOC team also presented their first poster, “Parallel Auditing of the University of Cambridge and the University of Oxford’s Institutional Repositories,” which is available on the ‘Resource’ page.


Storm Doris waited to hit until after the main International Digital Curation Conference (IDCC) had ended, allowing for two days of great speakers. The conference focused on research data management (RDM) and sharing data. In Kevin Ashley’s wrap-up, he touched on data champions and the possibilities of data sharing as two of the many emerging themes from IDCC.

Getting researchers to commit to good data practice and then publish data for reuse is not easy. Many talks focused on training and engagement of researchers to improve their data management practice. Marta Teperek and Rosie Higman from Cambridge University Library (CUL) gave excellent talks on engaging their research community in RDM. Teperek found value in going to the community in a bottom-up, research-led approach. It was time-intensive, but allowed the RDM team at CUL to understand the problems Cambridge researchers faced and address them. A top-down, policy-driven approach was also used, but it has been a combination of the two that has been the most effective for CUL.

Higman went on to speak about the data champions initiative. Data champions were recruited from students, post-doctoral researchers, administrators and lecturers. What they had in common was their willingness to advocate for good RDM practices. Each of the 41 data champions was responsible for at least one training session per year. While the data champions did not always do what the team expected, their advocacy for good RDM practice has been invaluable. Researchers need strong advocates to see the value in publishing their data – it is not just about complying with policy.

On day two, I heard from researcher and data champion Dr. Niamh Moore from University of Edinburgh. Moore finds that many researchers either think archiving their data is a waste of time or are concerned about the future use of their data. As a data champion, she believes that research data is worth sharing and thinks other researchers should be asking, ‘how can I make my data flourish?’. Moore uses Omeka to share her research data from her mid-90s project at the Clayoquot Sound peace camp called Clayoquot Lives. For Moore, benefits to sharing research data include:

  • using it as a teaching resource for undergraduates (getting them to play with data, which many do not have a chance to do);
  • public engagement impact (for Moore it was an opportunity to engage with the people previously interviewed at Clayoquot); and
  • new articles: creating new relationships and new research where she can reuse her own data in new ways or other academics can as well.

Opening up data and archiving leads to new possibilities. The closing keynote on day one discussed the possibilities of using data to improve the visitor experience for people at the British Museum. Data Scientist, Alice Daish, spoke of data as the unloved superhero. It can rescue organisations from questions and problems by providing answers, helping them make decisions and take actions, and even raising more questions. For example, Daish has been able to wrangle and utilise data at the British Museum to learn about the most popular collection items on display (the Rosetta Stone came first!).

And Daish, like Teperek and Higman, touched on outreach as the only way to advocate for data – creating good data, sharing it, and using it to its fullest potential. The DPOC team welcomes this advocacy, and we’d like to add to it by seeing that steps are also taken to preserve this data.

Also, it was great to talk about the work we have been doing and the next steps for the project. Thanks to everyone who stopped by our poster!

Oxford Fellows (From left: Sarah, Edith, James) holding the DPOC poster out front of the appropriately named “Fellows Entrance” at the Royal College of Surgeons.

Polonsky Fellows visit Western Bank Library at Sheffield University

Overview of DPOC’s visit to the Western Bank Library at Sheffield University by James Mooney, Technical Fellow at Bodleian Libraries, Oxford.
___________________________________________________________________________

The Polonsky Fellows were invited to the Western Bank Library at Sheffield University to speak with Laura Peaurt and other members of the Library. The aim of the meeting was to discuss the experiences of using and implementing Ex Libris’ Rosetta product.

After arriving by train, it was just a quick tram ride to the Western Bank campus at Sheffield University. We then had the fun of using the paternoster lift in the Western Bank Library to arrive at our meeting; it’s great to see that this technology has been preserved and is still in use.

Paternoster lifts still in use at the Western Bank Library. Image Credit: James Mooney

We met with Laura Peaurt (Digital Preservation Manager), Chris Jones (Library Systems Manager) and Angus Taggart (Library Systems Manager – Research).

Andy Bussey, Head of Digital Services & Systems was kind enough to give us an hour of his time at the start of the meeting, allowing us to discuss parts of the procurement and implementation process.

When working out the requirements for the system, Sheffield was able to collaborate with the White Rose University Consortium (the Universities of Leeds, Sheffield and York) to work out an initial scope.

When reviewing the options, both open source and proprietary products were considered. For the Western Bank Library and the University back in 2014, after a skills audit, the open source options had to be ruled out due to a lack of technical and developmental skills to customise or support them. I’m sure that if this were revisited today the outcome might well be different, as the team has grown and gained experience and expertise. Many organisations may find it easier to budget for a software package and support contract with a vendor than to pursue the creation of several new employment positions.

With that said, as part of the implementation of Rosetta, Laura’s role was created, as there was an obvious need for a Digital Preservation Manager. We then went on to discuss the timeframe of the project before moving on to the configuration of the product, with Laura providing a live demonstration whilst talking about the current setup, the scalability of the instances and the granularity of the sections within Rosetta.

During the demonstrations we discussed what content was held in Rosetta, how people had been trained with Rosetta and what feedback they had received so far. We reviewed the associated metadata which had been stored with the items that had been ingested and went over the options regarding integration with a Catalogue and/or Archival Management System.

After lunch we went on to discuss the workflows currently being used, with further demonstrations so we could see end-to-end examples, including what ingest rules and policies were in place, what tools were in use and what processes were carried out. We then looked at how problematic items were dealt with in the Technical Analysis Workbench, covering the common issues and how additional steps in the ingest process can minimise certain issues.

As part of reviewing the sections of Rosetta we also inspected Rosetta’s metadata model, DNX (Digital Normalised XML), and discussed ingesting born-digital content and associated METS files.

Western Library. Image Credit: A J Buildings Library.

We visited Sheffield with many questions, and many of these were answered during the course of the day’s discussions. As the day came to a close, though, we had to wrap up the talks and head back to the train station. We all agreed it had been an invaluable meeting that sparked further areas of discussion. Having met face to face, and with an understanding of the environment at Sheffield, we expect future conversations to be that much easier.

DPOC visits the Wellcome Library in London

A brief summary by Edith Halvarsson, Policy and Planning Fellow at the Bodleian Libraries, of the DPOC project’s recent visit to the Wellcome Library.
___________________________________________________________________________

Last Friday the Polonsky Fellows had the pleasure of spending a day with Rioghnach Ahern and David Thompson at the Wellcome Library. With a collection of over 28.6 million digitized images, the Wellcome is a great source of knowledge and experience in working with digitisation at a large scale. Themes of the day centred around pragmatic choices, achieving consistency across time and scale, and horizon scanning for emerging trends.

The morning started with an induction from Christy Henshaw, the Wellcome’s Digital Production Manager. We discussed digitisation collection development and JPEG2000 profiles, but also future directions for the library’s digitised collection. One point which particularly stood out to me was the change in user requirements around delivery of digitised collections. The Wellcome has found that researchers are increasingly requesting delivery of material for “use as data”. (As a side note: this is something which the Bodleian Libraries have previously explored in their Blockbooks project, which used facial recognition algorithms traditionally associated with security systems to trace the provenance of dispersed manuscripts.) As the possibilities for large scale analysis using these types of algorithms multiply, the Wellcome is considering how delivery will need to change to accommodate new scholarly research methods.

Brain teaser: Spot the Odd One Out (or is it a trick question?). Image credit: Somaya Langley

Following Christy’s talk we were given a tour of the digitization studios by Laurie Auchterlonie. Laurie was in the process of digitising recipe books for the Wellcome Library’s Recipe Book Project. He told us about some less appetising recipes from the collection (such as three-headed pig soup, and puppy dishes) and about the practical issues of photographing content in a studio located on top of one of the busiest underground lines in London!

After lunch with David and Rioghnach at the staff café, we spent the rest of the afternoon looking at Goobi plug-ins, Preservica and the Wellcome’s hybrid-cloud storage model. Although the day was about digitisation, metadata was a recurring topic in several of the presentations. Descriptive metadata is particularly challenging to manage as it tends to be a work in progress: it is always possible to improve and correct. This creates a tension between curators and cataloguers doing their work, and the inclination to store metadata together with digital objects in preservation systems to avoid orphaning files. Wellcome’s solution has been to articulate their three core cataloguing systems as the canonical bibliographic source, while allowing potentially out-of-date metadata to travel with objects in both Goobi and Preservica for in-house use only. As long as there is clarity around which is the canonical metadata record, these inconsistencies are not problematic to the library. ‘You would be surprised how many institutions have not made a decision around which their definitive bibliographic record is’, says David.

Presentation on the Wellcome Library’s digitisation infrastructure. Image credit: Somaya Langley

The last hour was spent pondering the future of digital preservation, and I found the conversations very inspiring and uplifting. As we work with the long term in mind, it is invaluable to have these chances to get out of our local context and discuss wider trends with other professionals. Themes included: digital preservation as part of archival master’s courses, cloud storage and virtualisation, and the move from repository software to dispersed micro-services.

The Fellows’ field trip to the Wellcome is one of a number of visits that DPOC will make during 2017 to talk to institutions around the UK about their work on digital preservation. Watch www.dpoc.ac.uk for more updates.