Polonsky Fellows visit Western Bank Library at Sheffield University

Overview of DPOC’s visit to the Western Bank Library at Sheffield University by James Mooney, Technical Fellow at Bodleian Libraries, Oxford.
___________________________________________________________________________

The Polonsky Fellows were invited to the Western Bank Library at Sheffield University to speak with Laura Peaurt and other members of the Library. The aim of the meeting was to discuss the experiences of using and implementing Ex Libris’ Rosetta product.

After we arrived by train, it was just a quick tram ride to the Western Bank campus at Sheffield University. We then had the fun of using the paternoster lift in the Western Bank Library to reach our meeting; it’s great to see that this technology has been preserved and is still in use.

Paternoster lifts still in use at the Western Bank Library. Image Credit: James Mooney

We met with Laura Peaurt (Digital Preservation Manager), Chris Jones (Library Systems Manager) and Angus Taggart (Library Systems Manager – Research).

Andy Bussey, Head of Digital Services & Systems, was kind enough to give us an hour of his time at the start of the meeting, allowing us to discuss parts of the procurement and implementation process.

When defining the requirements for the system, Sheffield was able to collaborate with the White Rose University Consortium (the Universities of Leeds, Sheffield and York) to work out an initial scope.

When reviewing the options, both open source and proprietary products were considered. For the Western Bank Library and the University back in 2014, after a skills audit, the open source options had to be ruled out due to a lack of technical and development skills to customise or support them. I’m sure that if this were revisited today the outcome might well be different, as the team has grown and gained experience and expertise. Many organisations may find it easier to budget for a software package and support contract with a vendor than to pursue the creation of several new employment positions.

That said, Laura’s role was created as part of the implementation of Rosetta, as there was an obvious need for a Digital Preservation Manager. We went on to discuss the timeframe of the project and then moved on to the configuration of the product. Laura provided a live demonstration while talking about the current setup, the scalability of the instances and the granularity of the sections within Rosetta.

During the demonstrations we discussed what content was held in Rosetta, how people had been trained to use Rosetta and what feedback they had received so far. We reviewed the associated metadata stored with the ingested items and went over the options for integration with a Catalogue and/or Archival Management System.

After lunch we went on to discuss the workflows currently being used, with further demonstrations so we could see end-to-end examples, including what ingest rules and policies were in place, what tools were in use and what processes were carried out. We then looked at how problematic items were dealt with in the Technical Analysis Workbench, covering the common issues and how additional steps in the ingest process can minimise certain issues.

As part of reviewing the sections of Rosetta we also inspected Rosetta’s metadata model, DNX (Digital Normalised XML), and discussed ingesting born-digital content and associated METS files.

Western Bank Library. Image Credit: A J Buildings Library.

We visited Sheffield with many questions, and many of these were answered during the course of the day’s discussions. As the day came to a close, we had to wrap up the talks and head back to the train station. We all agreed it had been an invaluable meeting that sparked further areas of discussion. Having met face to face, and with an understanding of the environment at Sheffield, we will find future conversations that much easier.

DPOC visits the Wellcome Library in London

A brief summary by Edith Halvarsson, Policy and Planning Fellow at the Bodleian Libraries, of the DPOC project’s recent visit to the Wellcome Library.
___________________________________________________________________________

Last Friday the Polonsky Fellows had the pleasure of spending a day with Rioghnach Ahern and David Thompson at the Wellcome Library. With a collection of over 28.6 million digitised images, the Wellcome is a great source of knowledge and experience in working with digitisation at a large scale. Themes of the day centred around pragmatic choices, achieving consistency across time and scale, and horizon scanning for emerging trends.

The morning started with an induction from Christy Henshaw, the Wellcome’s Digital Production Manager. We discussed digitisation collection development and JPEG 2000 profiles, but also future directions for the library’s digitised collection. One point which particularly stood out to me was the change in user requirements around delivery of digitised collections. The Wellcome has found that researchers are increasingly requesting delivery of material for “use as data”. (As a side note: this is something which the Bodleian Libraries have previously explored in their Blockbooks project, which used facial recognition algorithms, traditionally associated with security systems, to trace the provenance of dispersed manuscripts.) As the possibilities for large-scale analysis using these types of algorithms multiply, the Wellcome is considering how delivery will need to change to accommodate new scholarly research methods.

Brain teaser: Spot the Odd One Out (or is it a trick question?). Image credit: Somaya Langley

Following Christy’s talk we were given a tour of the digitisation studios by Laurie Auchterlonie. Laurie was in the process of digitising recipe books for the Wellcome Library’s Recipe Book Project. He told us about some less appetising recipes from the collection (such as three-headed pig soup, and puppy dishes) and about the practical issues of photographing content in a studio located on top of one of the busiest underground lines in London!

After lunch with David and Rioghnach at the staff café, we spent the rest of the afternoon looking at Goobi plug-ins, Preservica and the Wellcome’s hybrid-cloud storage model. Although the day was about digitisation, metadata was a recurring topic in several of the presentations. Descriptive metadata is particularly challenging to manage as it tends to be a work in progress: it is always possible to improve and correct. This creates a tension between curators and cataloguers doing their work, and the inclination to store metadata together with digital objects in preservation systems to avoid orphaning files. Wellcome’s solution has been to articulate their three core cataloguing systems as the canonical bibliographic source, while allowing potentially out-of-date metadata to travel with objects in both Goobi and Preservica for in-house use only. As long as there is clarity around which is the canonical metadata record, these inconsistencies are not problematic to the library. ‘You would be surprised how many institutions have not made a decision around which their definitive bibliographic record is’, says David.

Presentation on the Wellcome Library’s digitisation infrastructure. Image credit: Somaya Langley

The last hour was spent pondering the future of digital preservation and I found the conversations very inspiring and uplifting. As we work with the long-term in mind, it is invaluable to have these chances to get out of our local context and discuss wider trends with other professionals. Themes included: digital preservation as part of archival masters courses, cloud storage and virtualisation, and the move from repository software to dispersed micro-services.

The Fellows’ field trip to the Wellcome is one of a number of visits that DPOC will make during 2017 to talk to institutions around the UK about their work on digital preservation. Watch www.dpoc.ac.uk for more updates.

Audiovisual creation and preservation

Following on from the well-received post Filling the digital preservation gap(s), Somaya reflects on an in-house workshop she recently attended, ‘Video Production: Shoot, Edit and Upload’, which prompted these thoughts and some practical advice on analogue and digital audiovisual preservation.


My photographer colleague, Maciej, and I attended a video editing course at Cambridge University. I was there to learn about what video file formats staff at the University are creating and where these are being stored and made available, with a view to future preservation of this type of digital content. It is important we know what types of content the university is creating, so we know what we will have to preserve now and in the future.

While I have an audio background (having started out splicing reel-to-reel tapes), for the past 20 years I have predominantly worked in the digital domain. I am not an analogue audiovisual specialist, particularly not in film and video. However, I have previously worked for an Australian national broadcaster (in the radio division) and the National Film and Sound Archive of Australia (developing a strategy for acquiring and preserving multi-platform content, such as apps and interactive audiovisual works).

A range of analogue and digital carriers. Image credit: Somaya Langley

Since my arrival, both Cambridge University Library and Bodleian Libraries, Oxford have been very keen to discuss their audiovisual collections, and I’m led to believe there may be some significant film collections held in Cambridge University Library (although I’ve yet to see them in person). As many people have been asking about audiovisual material, I thought I would briefly share some information (from an Australasian perspective).

A ten-year deadline for audiovisual digitisation

In 2015, the National Film and Sound Archive of Australia launched a strategy paper called Deadline 2025: collections at risk, which outlines why there is a ten-year deadline to digitise analogue (or digital tape-based) audiovisual material. The reasons are the fragility of the carriers (the reels, tapes etc.), the discontinuation of playback equipment (a considerable proportion of equipment purchased is secondhand and bought via eBay or similar services) and the disappearance of specialist skills. The knowledge of analogue audiovisual held by engineers of this era is considerable. These engineers have started to retire, and while there is some succession planning, there is not nearly enough to retain the in-depth, wide-ranging and highly technical skill-sets and knowledge of engineers trained last century.

Obsolete physical carriers

Why is it that audio and video content requires extra attention? A considerable amount of specialist knowledge is required to understand how carriers are best handled. In the same way that conservation staff know how to repair delicate, centuries-old paper or paintings, similar knowledge is required to handle audiovisual carriers such as magnetic tape (cassettes, reel-to-reel tapes) or optical media (CDs, DVDs etc.). Not knowing how to wind tapes, when a tape requires ‘baking’ or how to hold a CD correctly can result in damage to the carrier. Further information on handling carriers can be found here: http://www.iasa-web.org/tc05/handling-storage-audio-video-carriers. If you’re struggling to identify an audiovisual or digital carrier, then Mediapedia (a resource initiated by Douglas Elford at the National Library of Australia) is a great starting point.

Earlier this year, my former State Library of New South Wales colleagues in Sydney, Scott Wajon and Damien Cassidy, and I produced an Obsolete Physical Carriers Report based on a survey of audiovisual and digital carriers held in nine Australian libraries for the National and State Libraries Australasia (NSLA). This outlined the scope of the problem of ‘at-risk’ content held on analogue and digital carriers, and the need to transfer this content within the next decade. Of note is the short lifespan of ‘burnt’ (as opposed to professionally mastered) CDs and DVDs.

Audio preservation standards

In 2004, the International Association of Sound and Audiovisual Archives (IASA) first published the audio preservation standard Guidelines on the Production and Preservation of Digital Audio Objects. This sets a standard for the quality of digital audio produced for preservation. I have been lucky to have worked with the editor (Kevin Bradley from the National Library of Australia) and several of the main contributors (including Matthew Davies) in some of my previous roles.

Other standards publications IASA has produced can be found here: http://www.iasa-web.org/iasa-publications

Video preservation standards

Since approximately 2010, IASA has been working towards publishing a similar standard for video preservation. While this has yet to be released, it is likely to be soon (hopefully 2017?).

In lieu of a world-wide standard for video

As audiovisual institutions around the world digitise their film and video collections, they are developing their own internal guidelines and procedures regarding ‘preservation quality’ video. However, best practice has started to form, with many choosing to use:

  • Lossless Motion JPEG 2000, inside an MXF OP1a wrapper

There is also interest in another codec, which various audiovisual preservation specialists are discussing as a possible alternative video preservation standard:

  • Lossless FFV1 (FF Video Codec 1)

For content that has been captured at a lower quality in the first place (e.g. video created with consumer rather than professional equipment), another format various collecting institutions may consider is:

  • Uncompressed AVI

Why is video tricky?

For the most part, video is more complex than audio for several reasons including:

  • A video file format may not be what it seems: there is both a container (aka wrapper) and, inside it, the encoded video (e.g. a QuickTime MOV file containing content encoded as H.264).
  • Video codecs can be lossy (compressed with a loss of information) or lossless (compressed, but without any data being lost as part of the encoding process).

The MediaInfo tool can provide information about both the container and the encoded content inside it for a wide range of file formats.
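As a rough illustration of the container/codec split, here is a minimal Python sketch that asks the MediaInfo command-line tool for a JSON report and pulls out the container format and the format of each video or audio stream. It assumes the `mediainfo` command is installed and that your version supports `--Output=JSON`; the function names are my own, and the JSON layout the code expects (a `media` object holding a list of `track` entries, each with `@type` and `Format`) is an assumption worth checking against the output of your installed version.

```python
import json
import subprocess


def run_mediainfo(path):
    """Run the MediaInfo CLI on one file and return its parsed JSON report.

    Assumes `mediainfo` is on the PATH and supports --Output=JSON.
    """
    result = subprocess.run(
        ["mediainfo", "--Output=JSON", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def summarise_report(report):
    """Separate the container (wrapper) from the encoded streams inside it.

    Assumed layout: report["media"]["track"] is a list of dicts, where the
    "General" track describes the container and "Video"/"Audio" tracks
    describe the encoded streams.
    """
    tracks = (report.get("media") or {}).get("track", [])
    container = None
    streams = []
    for track in tracks:
        kind = track.get("@type")
        if kind == "General":
            container = track.get("Format")
        elif kind in ("Video", "Audio"):
            streams.append((kind, track.get("Format")))
    return {"container": container, "streams": streams}
```

Calling `summarise_report(run_mediainfo("some_file.mov"))` (the file name is hypothetical) should give one container value plus a list of streams, which makes the ‘file inside a wrapper’ idea visible at a glance.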

Of course, there are many things to consider and parameters to configure – hence needing film and video digitisation specialists and specialist equipment to produce preservation quality digitised video.

From the US, the Federal Agencies Digital Guidelines Initiative (FADGI) is also a great resource for information about audiovisual digitisation.

Consumer-produced audiovisual content

While I would recommend that consumers capture and produce audiovisual content at as high a quality as their equipment allows (a minimum of 24-bit, 48 kHz WAV files for audio and uncompressed AVI for video), I’m aware those using mobile devices aren’t necessarily going to do this. So, in addition to ensuring, where possible, that preservation quality audiovisual content is created now and in the future, we will also have to take into account significant content being created on non-professional consumer-grade equipment and the potentially proprietary file formats produced.
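If you want to check whether an existing WAV file meets the 24-bit, 48 kHz minimum suggested above, a small sketch using Python’s standard `wave` module is shown below. The function names and the two threshold constants are my own, and the `wave` module only reads plain PCM WAV files, so treat this as a quick check under those assumptions and not a substitute for proper characterisation tools.

```python
import wave

# Thresholds taken from the recommendation above: 24-bit, 48 kHz.
MIN_BIT_DEPTH = 24
MIN_SAMPLE_RATE_HZ = 48_000


def describe_wav(path):
    """Read the basic technical properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate_hz": sample_rate,
            "duration_s": wav.getnframes() / sample_rate,
        }


def meets_minimum(info):
    """True if the file is at least 24-bit and at least 48 kHz."""
    return (
        info["bit_depth"] >= MIN_BIT_DEPTH
        and info["sample_rate_hz"] >= MIN_SAMPLE_RATE_HZ
    )
```

For example, `meets_minimum(describe_wav("interview.wav"))` (a hypothetical file name) would return False for a typical 16-bit, 44.1 kHz CD-quality recording.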

What can you do?

If you’re creating audio and/or video content:

  • set your device to the highest quality it will allow (however, you will need to take into account the amount of storage this will require)
  • try to avoid proprietary and less common file formats and codecs
  • be aware that, especially for video content, your file is a little more complex than you might have expected: it’s a ‘file’ inside a ‘wrapper’, so it’s almost like two files, one inside the other…

How big?

Another consideration is the file size of digitised and born-digital film and video content, which has implications for how to ‘wrangle’ files as well as for the considerable storage needed … however this is best left for a future blog post.
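To give a feel for the numbers in the meantime, here is a back-of-the-envelope sketch in Python. It only covers uncompressed content (size = data rate × duration), and the example parameters are my own assumptions chosen for illustration: standard-definition 720 × 576 video at 25 frames per second with an average of 16 bits per pixel, and 24-bit, 48 kHz stereo audio. Lossless codecs will produce smaller files, by an amount that depends on the content.

```python
def uncompressed_video_bytes(width, height, bits_per_pixel, fps, seconds):
    """Size of uncompressed video: pixels per frame x bits x frames, in bytes."""
    return width * height * bits_per_pixel * fps * seconds / 8


def uncompressed_audio_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed PCM audio in bytes."""
    return sample_rate_hz * bit_depth * channels * seconds / 8


ONE_HOUR = 3600  # seconds

# Assumed example: SD video, 720 x 576, 16 bits per pixel on average, 25 fps.
video = uncompressed_video_bytes(720, 576, 16, 25, ONE_HOUR)

# Assumed example: 24-bit, 48 kHz stereo audio.
audio = uncompressed_audio_bytes(48_000, 24, 2, ONE_HOUR)

print(f"One hour of video: {video / 1e9:.1f} GB")  # about 74.6 GB
print(f"One hour of audio: {audio / 1e9:.2f} GB")  # about 1.04 GB
```

Even at standard definition, an hour of uncompressed video under these assumptions is roughly seventy times the size of an hour of preservation-quality audio, which is why storage planning matters so much for film and video.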

We will discuss more about born-digital audiovisual content and considerations as the DPOC project progresses.

Introducing Digital Preservation at Oxford and Cambridge

1 August 2016 marks the beginning of a two-year collaborative project between Cambridge University Library (Cambridge) and University of Oxford’s Bodleian Libraries (Oxford). This project has been funded to assess current practices, then design and implement best practice digital preservation programmes at each institution.