Update on the training programme pilot

Sarah, Oxford’s Outreach and Training Fellow, has been busy since the new year designing and running a digital preservation training programme pilot in Oxford. It consisted of one introductory course on digital preservation and six other workshops. Below is an update on what she did for the pilot and what she has learnt over the past few months.


It’s been a busy few months for me, so I have been quiet on the blog. Most of my time and creative energy has been spent working on this training programme pilot. In total, there were seven courses and over 15 hours of material. In the end, I trialled the courses on over 157 people from Bodleian Libraries and the various Oxford college libraries and archives. Many attendees were repeats, but some were not.

The trial gave me an opportunity to test out different ideas and various topics. Attendees were good at giving feedback, both during the course and after via an online survey. It’s provided me with further ideas and given me the chance to see what works or what doesn’t. I’ve been able to improve the experience each time, but there’s still more work to be done. However, I’ve already learned a lot about digital preservation and teaching.

Below are some of the most important lessons I’ve learned from the training programme pilot.

Time: You always need more

I found that I almost always ran out of time at the end of a course; it left no time for questions or to finish that last demo. Most of my courses could have either benefited from less content, shorter exercises, or just being 30 minutes longer.

Based on feedback from attendees, I’ll be making adjustments to every course. Some will be longer, some will have shorter exercises with more optional components, and some will have slightly less content.

While you might budget 20 minutes for an activity, you will likely use 5-10 minutes more. But it varies every time due to the attendees: some might have a lot of questions, while others will be quieter. It’s almost better to overestimate the time and end early than rush to cover everything. People need a chance to process the information you give them.

Facilitation: You can’t go it alone

In only one of my courses did I have to facilitate alone. I was run off my feet for the 2 hours because it was just me answering questions during exercises for 15 attendees. It doesn’t sound like a lot, but I had a hoarse voice by the end from speaking for almost 2 hours!

Always get help with facilitation—especially for workshops. Someone to help:

  • answer questions during exercises,
  • get some of the group idea exercises/conversations started,
  • make extra photocopies or print outs, and
  • load programs and files onto computers—and then help delete them after.

It is possible to run training courses alone, but having an extra person makes things run smoother and saves a lot of time. Edith and James have been invaluable support!

Demos: Worth it, but things often go wrong

Demos were vital to illustrate concepts, but they were also sometimes clunky and time consuming to manage. I wrote up demo sheets to help. The demos relied on software or the Internet—both of which can and will go wrong. Patience is key; so is accepting that sometimes things will not go right. Processes might take a long time to run, or the course may conclude before the demo is over.

The more you practise on the computer you will be using, the more likely things will go right. But that’s not always an option. If it isn’t, always have a backup plan. Or just apologise, explain what should have happened and move on. Attendees are generally forgiving and sometimes it can be turned into a really good teaching moment.

Exercises: Optional is the way to go

Unless you put out a questionnaire beforehand, it is incredibly hard to judge the skill level of your attendees. It’s best to prepare for all levels. Start each exercise slowly and have a lot of optional work built in for people who work faster.

In most of my courses I was too ambitious for the time allowed. I wanted them to learn and try everything. Sometimes I wasn’t asking the right questions on the exercises either. Testing exercises and timing people is the only way to tailor them. Now that I have run the workshops and seen the exercises in action, I have a clearer picture of what I want people to learn and accomplish—now I just have to make the changes.

Future plans

There were courses I would love to run in the future (like data visualisation and digital forensics), but I did not have the time to develop them. I’d like to place them on a roadmap for future training, as well as reaching out more to the Oxford colleges, museums and other departments. I would also like to tailor the introductory course a bit more for different audiences.

I’d like to get involved with developing courses like Digital Preservation Carpentry that the University of Melbourne is working on. The hands-on workshops excited and challenged me the most. Not only did others learn a lot, but so did I. I would like to build on that.

At the end of this pilot, I have seven courses that I will finalise and make available under a Creative Commons licence. What I learned when trying to develop these courses is that there aren’t many good templates available on the Internet to use as a starting point—you have to ask around for people willing to share.

So, I am hoping to take the work that I’ve done and share it with the digital preservation community. I hope they will be useful resources that can be reused and repurposed. Or at the very least, I hope it can be used as a starting point for inspiration (basic speakers notes included).

These will be available via the DPOC website sometime this summer, once I have been able to make the changes necessary to the slides and exercises—along with course guidance material. It has been a rewarding experience (as well as an exhausting one); I look forward to developing and delivering more digital preservation training in the future.

Breaking through with Library Carpentry

Thursday 11th January saw the Cambridge University Library’s annual conference take place. This year, it was entitled ‘Breakthrough the Library’, and focused on cutting-edge innovation in libraries and archives. I can honestly say that this was the first ever conference I’ve been to where every single speaker I saw (including the ten or so who gave lightning talks) was absolutely excellent.

So it’s hard to pick the one that made the most impression. Of course, an honourable mention must go to the talk about Jasper the three legged cat, but if I had to plump for the one that was most pertinent to moving Digital Preservation forward, I’d have picked “Library Carpentry: software and data skills for librarian professionals”, from Dr James Baker of the University of Sussex.

I’d heard of the term ‘Library Carpentry’ (and the initiatives it stems from – Software Carpentry and Data Carpentry) and thus had an idea what the talk was about on the way in. Their web presence explains things far better than I can, too (see https://librarycarpentry.github.io/), so I’m going to skip the exposition and make a different point…

As a full-blown, time-served nerd who’s clearly been embittered by 20 years in the IT profession (though I’m pleased to report, not as much as most of my long-term friends and colleagues!), I went into the talk with a bit of a pessimistic outlook. This was because, in my experience, there are three stages one passes through when learning IT skills:

  • Stage 1: I know nothing. This computer is a bit weird and confuses me.
  • Stage 2: I know EVERYTHING. I can make this computer sing and dance, and now I have the power to conquer the world.
  • Stage 3: … er – hang on… The computer might not have been doing exactly what I thought it was, after all… Ooops! What did I just do?

Stage 1 is just something you get through (if you want – I have nothing but respect for happy Stage 1 dwellers, though). If so inclined, all it really takes is a bit of persistence and a dollop of enthusiasm to get through it. If you want to but think you might struggle, then have a go at this computer programming aptitude test from the University of Kent – you may be pleasantly surprised… In my own case, I got stuck there for quite a while until one day a whole pile of O Level algebra that was lurking in my brain suddenly rose out of the murk, and that was that.

Stage 2 people, on the other hand, tend to be really dangerous… I have personally worked with quite a few well-paid developers who are stuck in Stage 2, and they tend to be the ones who drop all the bombs on your system. So the faster you can get through to Stage 3, the better. This was at the root of my concern, as one of the ideas of Library Carpentry is to pick up skills quick, and then pass them on. But I needn’t have worried because…

When I asked Dr Baker about this issue, he reassured me that ‘questioning whether the computer has done what you expected’ is a core learning point that is central to Library Carpentry, too. He also declared the following (which I’m going to steal): “I make a point of only ever working with people with Impostor Syndrome”.

Hence it really does look as if getting to Stage 3 without even going through Stage 2 at all is what Library Carpentry is all about. I believe moves are afoot to get some of this good stuff going at Cambridge… I watch with interest and might even be able to find the time to join in..? I bet it’ll be fun.

Towards a common understanding?

Cambridge Outreach and Training Fellow, Lee, describes the rationale behind trialling a recent workshop on archival science for developers, as well as reflecting on the workshop itself. Its aim was to get all those working in digital preservation within the organisation to have a better understanding of each other’s work, to improve co-operation for a sustainable digital preservation effort.


Quite often, there is a perceived language barrier due to the wide range of practitioners that work in digital preservation. We may be using the same words, but there’s not always a shared common understanding of what they mean. This became clear when I was sitting next to my colleague, a systems integration manager, at an Archivematica workshop in September. Whilst not a member of the core Cambridge DPOC team, our colleague is a key member of our extended digital preservation network at Cambridge University Library and plays a key role in developing, understanding and retaining digital preservation knowledge in the institution.

For those from a recordkeeping background, the design principles behind the front end of Archivematica should be obvious, as it incorporates both traditional principles of archival practice and features of the OAIS model. However, coming from a systems integration point of view, my colleague needed words such as ‘accession’, ‘appraisal’ and ‘arrangement’ translated—words whose meanings many of us with an archival education take for granted.

I asked my colleague if an introductory workshop on archival science would be useful, and she said, “yes, please!” Thus, the workshop was born. Last week, a two and a half hour workshop was trialled for members of our developer and systems integration colleagues. The aim of the workshop was to enable them to understand what archivists are taught on postgraduate courses and how this teaching informs their practice. After understanding the attendees’ impressions of an archivist and the things that they do (see image), the workshop then practically explored how an archivist would acquire and describe a collection. The workshop was based on an imaginary company, complete with a history and description of the business units and examples of potential records they would deposit. There were practical exercises on making an accession record, appraising a collection, artificial arrangement and subsequent description through ISAD(G).

Sticky notes about archivists from a developer point of view.

Having then seen how an archivist would approach a collection, the workshop moved into explaining physical storage and preservation before moving onto digital preservation, specifically looking at OAIS and then examples of digital preservation software systems. One exercise was to get the attendees to use what they had learned in the workshop to see where archival ideas mapped onto the systems.

The workshop tried to demonstrate how archivists have approached digital preservation armed with the professional skills and knowledge that they have. The idea was to inform teams working with archivists on digital preservation of how archivists think, and how and why some of the tools and products are designed the way they are. My hope was for ‘IT’ to understand the depth of knowledge that archivists have, in order to help everyone work together on a collaborative digital preservation solution.

Feedback was positive and it will be run again in the New Year. Similarly, I’m hoping to devise a course from a developer perspective that will help archivists communicate more effectively with developers. Ultimately, both will be working from a better understanding of each other’s professional skill sets. Co-operation and collaboration on digital preservation projects will become much easier across disciplines and we’ll have a better informed (and more relaxed) environment to share practices and thoughts.

Transcribing interviews

The second instalment of Lee’s experience running a skills audit at Cambridge University Library. He explains what is needed to be able to transcribe the lengthy and informative interviews with staff.


There’s no ground-breaking digital preservation goodness contained within this post so you have permission to leave this page now. However, this groundwork is crucial to gaining an understanding of how institutions can prepare for digital preservation skills and knowledge development. It may also be useful to anyone who is preparing to transcribe recorded interviews.

Post-interview: transcribing the recording

Once you have interviewed your candidates and made sure that you have all the recordings (suitably backed up three times onto private, network-free storage like an encrypted USB stick, so as to respect privacy wishes), it is time to transcribe.

So, what do you need?

  • A very quiet room. Preferably silence, where there are no distractions and where you can’t distract people. You may wish to choose the dictation path, and if you do that in an open-plan office, you may attract attention. You will also be reciting information that you have assured participants will remain confidential.
  • Audio equipment. You will need a device that can play your audio files and has playback controls built in. You can use your device’s speakers or headphones (preferably with controls built into the wire), or a foot pedal.
  • Time. Bucket loads of it. If you are doing other work, this needs to become the big rock in your time planning; everything else should be mere pebbles and sand. This is where manager support is really helpful, as is…
  • Understanding. The understanding that this will rule your working life for the next month or two, and the understanding of those around you of the size of the task. An advocate who has experience of this type of work is invaluable.
  • Patience. Of a saint.
  • Simple transcription rules. Given the timeframes of the project, complex transcription would have been too time consuming. Please see the following work, as used by the University of California, San Diego; it’s really useful, with nice big text.
    Dresing, Thorsten / Pehl, Thorsten / Schmieder, Christian (2015): Manual (on) Transcription. Transcription Conventions, Software Guides and Practical Hints for Qualitative Researchers. 3rd English Edition. Marburg. Available online: http://www.audiotranskription.de/english/transcription-practicalguide.htm (last accessed: 27.06.2017). ISBN: 978-3-8185-0497-7.

Cropped view of person hands typing on laptop computer. Image credit: Designed by Freepik

What did you do?

Using a Mac environment, I imported the audio files for transcription into a desktop folder and created a play list in iTunes. I reduced the iTunes application to the mini player view and opened up Word to type into. I plugged in my headphones and pressed play and typed as I was listening.

If you get tired typing, the Word application on my Mac has a nifty voice recognition package. It’s uncannily good now. I tried to route the output sound into the mic using Soundflower, but I wasted time doing this: when the transcription did yield readable text, it used words worthy of inciting a Mary Whitehouse campaign. Still, I did find that dictation provided a rest for weary fingers. After a while, you will probably need to rest a weary voice, so you can switch back to typing.

When subjects started talking quickly, I needed a way to slow them down, as constantly pressing pause and rewind got onerous. A quick fix for this was to download Audacity, which has a function to slow down your sound files. Once the comedic effect of voice alteration has worn off, it becomes easier to transcribe as you don’t have to pause and rewind as much.

Process wise, it doesn’t sound much and it isn’t. It’s just the sheer hours of audio that need to be made legible through listening, rewinding and typing.

How can the process be made (slightly) easier?

  • Investigate transcription technology and processes. Look into the technologies available to you beforehand. I wish I had done this rather than relying on the expectation that I would just be listening and typing. I didn’t find a website with the answer, but a thoughtful web search can help you with certain parts of the transcription method.
  • Talk slowly. This one doesn’t apply to the transcription process but to the interview process. Try to ask the questions a little more slowly than you usually would, as the respondent will subconsciously mimic your speed of delivery and slow themselves down.

Hang on in there, it’s worth it

Even if you choose to incorporate the suggestions above, be under absolutely no illusions: transcription is a gruelling task. That’s not a slight against the participants’ responses for they will be genuinely interesting and insightful. No, it’s a comment on the frustration of the process and sheer mental grind of getting through it. I must admit I had only come to a reasonably happy transcription method by the time I had reached number fourteen (of fifteen). However, the effort is completely worth it. In the end, I now have around 65,000 quality words (research data) to analyse to understand what existing digital skills, knowledge, ways of learning and managing change exist within my institution that can be fed into the development of digital preservation skills and knowledge.

Skills interviewing using the DPOC skills interview toolkit

Cambridge Outreach & Training Fellow, Lee, shares his experiences in skills auditing.


As I am nearing the end of my fourteenth transcription and am three months into the skills interview process, now is a good time to pause and reflect. This post will look at the experience of the interview process using the DPOC digital preservation skills toolkit. This toolkit is currently under development; we are learning and improving it as we trial it at Cambridge and Oxford.

Step 1: Identify your potential participants

To understand colleagues’ use of technology and training needs, a series of interviews was arranged. We agreed that a maximum sample of 25 participants would give us plenty (perhaps too much?) of material to work with. Before invitations were sent out, a list was made of potential participants. In building the list, a set of criteria ensured that a broad range of colleagues was captured. These criteria consisted of:

  • in what department or library do they work?
  • is there a particular bias of colleagues from a certain department or library and can this be redressed?
  • what do they do?
  • is there a suitable practitioner to manager ratio?

These criteria rely on you having a good grasp of your institution, its organisation and the people within it. If you are unsure, start asking managers and colleagues who do know your institution very well—you will learn a lot! It is also worth having a longer list than your intended maximum in case you do not get responses, or people are not available or do not wish to participate.

Step 2: Inviting your potential participants

Prior to sending out invitations, the intended participants’ managers were consulted to see if they would agree to their staff time being used in this way. This was also a good opportunity to continue awareness raising of the project, as well as getting buy-in to the interview process.

The interviews were arranged in blocks of five to make planning around other work easier.

Step 3: Interviewing

The DPOC semi-structured skills interview questions were put to the test at this step. Having developed the questions beforehand ensured I covered the necessary digital preservation skills during the interview.

Here are some tips I gained from the interview process which helped to get some great responses.

  • Offer refreshments before the interview. Advise beforehand that a generous box of chocolate biscuits will be available throughout proceedings. This also gives you an excellent chance to talk informally to your subject and put them at ease, especially if they appear nervous.
  • If using recording equipment, make sure it is working. There’s nothing worse than thinking you have fifty minutes of interview gold only to find that you never pressed record or the device has run out of power. Take a second device, or if you don’t want the technological hassle, use pen(cil) and paper.
  • Start with colleagues that you know quite well. This will help you understand the flow of the questions better and they will not shy away from honest feedback.
  • Always have printed copies of interview questions. Technology almost always fails you.

My next post will be about transcribing and analysing interviews.

Outreach and Training Fellows visit CoSector, University of London

Outreach & Training Fellow, Lee, chronicles his visit with Sarah to meet CoSector’s Steph Taylor and Ed Pinsent.


On Wednesday 29 March, a date forever to be associated with the UK’s triggering of Article 50, Sarah and Lee met with CoSector’s Stephanie Taylor and Ed Pinsent in the spirit of co-operation. For those that don’t know, Steph and Ed are behind the award-winning Digital Preservation Training Programme.

Russell Square was overcast but it was great to see that London was still business as usual with its hallmark traffic congestion and bus loads of sightseers lapping up the cultural hotspots. Revisiting the University of London’s Senate House is always a visual pleasure and it’s easy to see why it was home to the Ministry of Information: the building screams order and neat filing.

Senate House, University of London. Image credit: By stevecadman – http://www.flickr.com/photos/stevecadman/56350347/, CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=6400009

We were keen to speak to Steph and Ed to tell them more about the DPOC Project to date and where we were at with training developments. Similarly, we were also keen to learn about the latest developments from CoSector’s training plans and we were interested to hear that CoSector will be developing their courses into more specialist areas of digital preservation, so watch this space… (well at least, the CoSector space).

It was a useful meeting because it gave us the opportunity to get instant feedback on the way the project is working and where we could help to feed into current training and development needs. In particular, they were really interested to learn about the relationship between the project team and IT. Because Sarah and I have access to two technical IT experts who are on board and happy to answer our questions—however simple they may be from an IT point of view—we feel that it is easier to understand IT issues. Similarly, we find that we have better conversations with our colleagues who are Developers and Operations IT specialists because we have a linguistic IT bridge with our technical colleagues.

It was a good learning opportunity and we hope to build upon this first meeting in the future as part of a sustainable training solution.

Training begins: personal digital archiving

Outreach & Training Fellow, Sarah, has officially begun training and capacity building with a session on personal digital archiving at the Bodleian Libraries. Below Sarah describes how the first session went and shares some personal digital archiving tips.


Early Tuesday morning and the Weston Library had just opened to readers. I got to town earlier than usual, stopping to get a Melbourne-style flat white at one of my favourite local cafes to get me in the mood for public speaking. By 9am I was in the empty lecture theatre, fussing over cords, adjusting lighting and panicking over the fact that I struggled to log in to the laptop.

At 10am, twenty-one interested faces were seated with pens at the ready; there was nothing else to do but take a deep breath and begin.

In the 1.5 hour session, I covered the DPOC project, digital preservation and personal digital archiving. The main section of the training was learning about personal digital archiving, the preservation lifecycle and the best practice steps to follow to save your digital stuff!

The steps of the Personal Digital Archiving & Preservation Lifecycle are intended to help with keeping your digital files organised, findable and accessible over time. It’s not prescriptive advice, but it is a good starting point for better habits in your personal and work lives. Below are tips for every stage of the lifecycle that will help build better habits and preserve your valuable digital files.

Keep Track and Manage:

  • Know where your digital files are and what digital files you have: make a list of all of the places you keep your digital files
  • Find out what is on your storage media: check the label, read the file and folder names, open the file to see the content
  • Most importantly: delete or dispose of things you no longer need.
    • This includes: things with no value, duplicates, blurry images, previous document versions (if not important) and so on.

Organise:

  • Use best practice for file naming:
    • No spaces; use underscores _ and hyphens - instead
    • Put the ‘Created Date’ in the file name using yyyymmdd format
    • Don’t use special characters <>,./:;'”\|[]()!@£$%^&*€#`~
    • Keep the name concise and descriptive
    • Use a version control system for drafts (e.g. yyyymmdd_documentname_v1.txt)
  • Use best practice for folder naming:
    • Concise and descriptive names
    • Use dates where possible (yyyy or yyyymmdd)
    • Keep file paths short and avoid a deep hierarchy
    • Choose structures that are logical to you and to others
  • To rename large groups of image files, consider using batch rename software
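If you are comfortable with a little scripting, batch renaming can also be done without dedicated software. Below is a minimal Python sketch that renames every file in a folder to the yyyymmdd_name_nnn convention described above, using each file's last-modified date. The folder and prefix names are hypothetical, and the script assumes the modified date is a reasonable stand-in for the created date (which, as noted, is easily lost).

```python
import os
from datetime import datetime

def batch_rename(folder, prefix):
    """Rename every file in `folder` to yyyymmdd_prefix_nnn.ext,
    using the file's last-modified date. Produces names with no
    spaces or special characters, per the conventions above."""
    new_names = []
    for i, name in enumerate(sorted(os.listdir(folder)), start=1):
        old_path = os.path.join(folder, name)
        if not os.path.isfile(old_path):
            continue  # skip sub-folders
        ext = os.path.splitext(name)[1].lower()
        date = datetime.fromtimestamp(os.path.getmtime(old_path)).strftime("%Y%m%d")
        new_name = f"{date}_{prefix}_{i:03d}{ext}"
        os.rename(old_path, os.path.join(folder, new_name))
        new_names.append(new_name)
    return new_names

# Hypothetical usage: batch_rename("/Users/me/Pictures/holiday", "holiday")
```

Always test a script like this on a copy of your files first; renaming is hard to undo if the mapping from old to new names is not recorded.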

Describe:

  • Add important metadata directly into the body of a text document:
    • creation date & version dates
    • author(s)
    • title
    • access rights & version
    • a description of the purpose or context of the document
  • Create a README.txt file of metadata for document collections
    • Be sure to list the folder names and file names to preserve the link between the metadata and the text file
    • Include information about the context of the collection, dates, subjects and relevant information
    • This is a quick method for creating metadata around digital image collections
  • Embed the metadata directly in the file
    • For image and video: be sure to add subjects, location and a description of the trip or event
  • Add tags to documents and images to aid discoverability
  • Consider saving the ‘Creation Date’ in the file name, a free text field in the metadata, in the document header or in a README text file if it is important to you. In some cases transferring the file (copying to new media, uploading to cloud storage) will change the creation date and the original date will be lost. The same goes for saving as a different file type. Always test before transfer or ‘Save As’ actions, or record the ‘Creation Date’ elsewhere.
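The README.txt step above can even be partly automated. The sketch below is one possible approach (not a prescribed format): it writes a README listing the collection name, the date, a free-text description you supply, and every file in the folder, preserving the link between the metadata and the files. The folder path and description are hypothetical.

```python
import os
from datetime import date

def write_readme(folder, description):
    """Create a README.txt in `folder` listing its files alongside a
    free-text description, so the metadata stays linked to the files."""
    lines = [
        f"Collection: {os.path.basename(os.path.abspath(folder))}",
        f"README created: {date.today().isoformat()}",
        f"Description: {description}",
        "",
        "Files:",
    ]
    for name in sorted(os.listdir(folder)):
        if name != "README.txt" and os.path.isfile(os.path.join(folder, name)):
            lines.append(f"  {name}")
    with open(os.path.join(folder, "README.txt"), "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

# Hypothetical usage:
# write_readme("/Users/me/Pictures/holiday", "Photos from the 2017 trip to Melbourne")
```

Remember to update the README whenever files are added, renamed or deleted, or the metadata and the collection will drift apart.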

Store:

  • Keep two extra backups in two geographically different locations
  • Diversify your backup storage media to protect against potential hardware faults
  • Try to save files in formats better suited to long-term access (for advice on how to choose file formats, visit Stanford University Libraries)
  • Refresh your storage media every three to five years to protect against loss from hardware failure
  • Do annual spot checks, including checking all backups. This will help catch any loss, corruption or damaged backups. Also consider checking all of the different file types in your collection to ensure they are still accessible, especially if not saved in a recommended long-term file format.
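One simple way to make those annual spot checks systematic is to record a checksum for each file when you store it, then re-compute and compare later. The sketch below shows the general idea using Python's standard hashlib; the function names are illustrative, not part of any particular tool.

```python
import hashlib
import os

def checksum(path):
    """Return the SHA-256 checksum of a file, read in chunks
    so large files don't have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def make_manifest(folder):
    """Record a checksum for every file, to be stored (and backed up)
    alongside the collection for later spot checks."""
    return {name: checksum(os.path.join(folder, name))
            for name in sorted(os.listdir(folder))
            if os.path.isfile(os.path.join(folder, name))}

def spot_check(folder, manifest):
    """Compare the folder against a saved manifest; return a list of
    (filename, problem) pairs for missing or altered files."""
    problems = []
    for name, expected in manifest.items():
        path = os.path.join(folder, name)
        if not os.path.exists(path):
            problems.append((name, "missing"))
        elif checksum(path) != expected:
            problems.append((name, "changed"))
    return problems
```

An empty result from `spot_check` means every recorded file is still present and unchanged; anything else tells you exactly which backup to repair from another copy.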

Even I can admit I need better personal archiving habits. How many photographs are still on my SD cards, waiting for transfer, selection/deletion and renaming before saving in a few choice safe backup locations? The answer is: too many. 

Perhaps now that my first training session is over, I should start planning my personal side projects. I suspect clearing my backlog of SD cards is one of them.

Useful resources on personal digital archiving:

DPC Technology Watch Report, “Personal digital archiving” by Gabriela Redwine

DPC Case Note, “Personal digital preservation: Photographs and video”, by Richard Wright

Library of Congress “Personal Archiving” website, which includes guidance on preserving specific digital formats, videos and more

 

DPC Student Conference – What I Wish I Knew Before I Started

At the end of January, I went to the Chancellor’s Hall at the University of London’s Art Deco style Senate House. Near to the entrance of the Chancellor’s Hall was Room 101. Rumours circulated amongst the delegates keenly awaiting the start of the conference that the building and the room were the inspiration for George Orwell’s Nineteen Eighty-Four.

Instead of facing my deepest and darkest digital preservation fears in Senate House, I was keen to see and hear what the leading digital preservation trainers and invited speakers at different stages of their careers had to say. For the DPOC project, I wanted to see what types of information were included in introductory digital preservation training talks, to witness the styles of delivery and what types of questions the floor would raise to see if there were any obvious gaps in the delivery. For the day’s programme, presenters’ slides and Twitter Storify, may I recommend that you visit the DPC webpage for this event:

http://www.dpconline.org/events/past-events/wiwik-2017

The take-away lesson from the day is: just do something, don’t be afraid to start. Sharon McMeekin showed us how much the DPC can help (see their new website, it’s chock full of digital preservation goodness) and Steph Taylor from CoSector showed us that you can achieve a lot in digital preservation just by keeping an eye on emerging technologies, and that you will spend much of your time advocating that digital preservation is not just backing up. Steph also reinforced to the student delegation that you can approach members of the digital preservation community; they are all very friendly!

From the afternoon session, Dave Thompson reminded those assembled that we also need to think about the information age that we live in: how people use information, how they are their own gatekeepers to their digital records, and how recordkeepers need to react to these changes, which will require a change in thinking from traditional recordkeeping theory and practice. As Adrian Brown put it, “digital archivists are archivists with superpowers”. One of those superpowers is the ability to adapt to your working context and the technological environment. Digital preservation is a constantly changing field and the practitioner needs to be able to adapt, chameleon-like, to the environment around them to get their institution’s work preserved. Jennifer Febles reminded us that it is also OK to say “you don’t know” when training people; you can go away and learn, or even learn from other colleagues. As for the content of the day, there were no real gaps; the day’s programme was spot on as far as I could tell from the delegates.

Whilst reflecting on the event on the train journey back (and whilst simultaneously being packed into the stifling hot carriage like a sweaty sardine), the one thing I really wanted to find out was the backgrounds of the delegates. More specifically: which 'information schools' they were attending, what courses they were undertaking, how much their modules concerned digital recordkeeping and preservation, and, most importantly, what they were being taught in those modules.

My thoughts then drifted towards those who have been given the label of 'digital preservation experts'. They have cut their digital preservation teeth after formal qualifications and training in an ostensibly different subject, judiciously blending discipline-specific learning with learning from related fields, and applying it to their specific working context. Increasingly, in the digital world, those from a recordkeeping background need to embrace computer science skills and applications, especially those for whom coding and command line operation are not skills they were brought up with. We seem to be at a point where the leading digital preservation practitioners are plying their trade (as they should) rather than teaching it in a formal education setup. A very select few do both, but if we pulled practitioners into formal digital preservation education programmes, would we drain the discipline of innovative practice? Are digital preservation skills (which DigCurV has done well to define) better suited to one big 'on the job' learning programme than to more formal programmes? A mix of both would be my suggestion, but this discussion will never close.

Starting out in digital preservation may seem terribly daunting: there is so much to learn because there is so much going on. I think the 'information schools' can equip students with the early skills and knowledge, but from then on, the experience and skills are learned on the job. What makes the digital preservation community stand out is that people are not afraid to share their knowledge and skills for the benefit of preserving cultural heritage for the future.

Post-holiday project update

You may be forgiven for thinking that the DPOC project has gone a little quiet since the festive period. In this post, Sarah summarises the work that continues apace.


The Christmas trees have been recycled, the decorations returned to attics or closets, and the last of the mince pies have been eaten. It is time to return to project work and face the reality that we are six months into the DPOC project. That leaves us one and a half years to achieve our aims and bring useful tools and recommendations to Cambridge, Oxford, and the wider digital preservation community. This of course means we’re neck-deep in reporting at the moment, so things have seemed a bit quiet.

So what does that mean for the project at the moment?


A view of my second screen at the moment. The real challenge is remembering which file I am editing. (Image credit: Sarah Mason)

At both Cambridge and Oxford, all Fellows are working on drafting collection audit reports and reviewing various policies. The Outreach & Training Fellows are disseminating their all-staff awareness survey and will be compiling the results in February. At Oxford, semi-structured interviews with managers and practitioners working with digital collections are in full swing. At Cambridge, the interviews will start after the awareness survey results have been analysed. This is expected to last through until March – holidays and illnesses willing! The Oxford team is getting their new Technical Fellow, James, up to speed with the project. Cambridge's Technical Fellow is speaking with many vendors and doing plenty of analysis on the institutional repository.

For those of you attending IDCC in Edinburgh in February, look for our poster on the TRAC and skills audits of our institutional repositories. Make sure to stop by and chat to us about our methodology and early results!

We’re also going to visit colleagues at a number of institutions around the UK over the next few months, seeing some technical systems in action and learning about their staff skills and policies. This knowledge sharing is crucial to the DPOC project, but also the growth of the digital preservation community.

And with six months gone, we're all in reporting mode, writing up and looking over our achievements so far. After the reports have been drafted, redrafted, and finalised, expect a full update and some reflections on how this collaborative project is going.

The digital preservation gap(s)

Somaya’s engaging, reflective piece identifies gaps in the wider digital preservation field and provides insightful thoughts as to how the gaps can be narrowed or indeed closed.


I initially commenced this post as a response to the iPres 2016 conference and an undercurrent that caught my attention there; however, it is really a broader comment on the field of digital preservation itself. This post ties into thoughts that have been brewing for several years about various gaps I've discovered in the digital preservation field. As part of the Polonsky Digital Preservation Project, I hope we will be able to do some of the groundwork to begin to address a number of these gaps.

So what are these gaps?

To me, there are many. And that’s not to say that there aren’t good people working very hard to address them – there are. (I should note that these people often do this work as part of their day jobs as well as evenings and weekends.)

Specifically, the gaps (at least the important ones I see) are:

  • Silo-ing of different areas of practice and knowledge (developers, archivists etc.)
  • Lack of understanding of working with born-digital materials at the coalface (including managing donor relationships)
  • Traditionally-trained archivists, curators and librarians wanting a ‘magic wand’ to deal with ‘all things digital’
  • A lack of tools for certain processes (or tools that do not exist for the technological platforms and constraints archivists, curators, and librarians have to work with)
  • Lack of existing knowledge of command line and/or coding skills in order to run the few available tools (skills that often traditionally-trained archivists, curators, and librarians don’t have under their belt)
  • Lack of knowledge of how to approach problem-solving

I’ve sat at the nexus between culture and technology for over two decades and these issues don’t just exist in the field of digital preservation. I’ve worked in festival and event production, radio broadcast and as an audiovisual tech assistant. I find similar issues in these fields too. (For example, the sound tech doesn’t understand the type of music the musician is creating and doesn’t mix it the right way, or the artist requesting the technician to do something not technically possible.) In the digital curation and digital preservation contexts, effectively I’ve been a translator between creators (academics, artists, authors, producers etc.), those working at the coalface of collecting institutions (archivists, curators and librarians) and technologists.

To me, one of the gaps was brought to the fore and exacerbated during the workshop: OSS4Pres 2.0: Building Bridges and Filling Gaps which built on the iPres 2015 workshop “Using Open-Source Tools to Fulfill Digital Preservation Requirements”. Last year I’d contributed my ideas prior to the workshop, however I couldn’t be there in person. This year I very much wanted to be part of the conversation.

What struck me was that the discussion still began from the notion that digital preservation commences once files are in a stable state, such as in a digital preservation system (or digital asset management system). Appraisal and data transfer weren't considered at all, yet it is essential to capture metadata (including technical metadata) at this very early point. (Metadata captured this early may become preservation metadata in the long run.)
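To make the point concrete, here is a minimal sketch (my own illustration, not a DPOC or iPres tool) of capturing basic technical metadata and a fixity value at the moment of transfer, before files ever reach a preservation system:

```python
import hashlib
import os
from datetime import datetime, timezone

def capture_technical_metadata(path):
    """Record basic technical metadata and a checksum at the point of transfer.

    Values captured this early can become preservation metadata later on.
    """
    stat = os.stat(path)
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files don't need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            sha256.update(chunk)
    return {
        "path": path,
        "size_bytes": stat.st_size,
        "last_modified": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
        "sha256": sha256.hexdigest(),
    }
```

A manifest of such records, written out as JSON or CSV alongside the transferred files, gives the receiving institution something to verify against from day one.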

I presented a common real-world use case/user story in acquiring born-digital collections: a donor has more than one Mac computer, each running a different operating system. The archivist needs to acquire a small selection of the donor's files. The archivist cannot install any software on the donor's computers or ask the donor to install any, and only the selected files may be collected – hence, none of the computers can be disk imaged.

The Mac-based tools that exist for this type of acquisition rely on Java software, and contemporary Mac operating systems don't come with Java installed by default. Many donors are not competent computer users; they haven't installed this software because they have no knowledge of it, no need for it, or literally wouldn't know how to. I put this call out to the Digital Curation Google Groups list several months ago, before I joined the Polonsky Digital Preservation Project. (It followed on from work that my former colleagues at the National Library of Australia and I had undertaken to collect born-digital manuscript archives, having first run into this issue in 2012.) The response to my real-world use case at iPres was:

This final option is definitely not possible in many circumstances, including when collecting political archives from networked environments inside government buildings (another real-world use case I’ve had first-hand experience of). The view was that anything else isn’t possible or is much harder (yes, I’m aware). Nevertheless, this is the reality of acquiring born-digital content, particularly unpublished materials. It demands both ‘hard’ and ‘soft’ skills in equal parts.
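For illustration of what a selective transfer under these constraints amounts to, here is a sketch in Python (my own, purely hypothetical example, run from external media so nothing is installed on the donor's machine; it assumes an interpreter is already present, which, as with Java, often cannot be taken for granted):

```python
import os
import shutil

def acquire_selected(files, destination):
    """Copy only the selected files to an acquisition area.

    shutil.copy2 preserves modification times, so this technical
    metadata survives the transfer; nothing else on the donor's
    machine is touched, and no disk image is taken.
    """
    os.makedirs(destination, exist_ok=True)
    copied = []
    for src in files:
        dst = os.path.join(destination, os.path.basename(src))
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

The code is trivial; the hard part, as the use case shows, is the surrounding negotiation: selecting the files with the donor, and doing it all without altering their computers.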

The discussion at iPres 2016 brought me back to the times I’ve previously thought about how I could facilitate a way for former colleagues to spend “a day in someone else’s shoes”. It’s something I posed several times when working as a Producer at the Australian Broadcasting Corporation.

Archivists have an incredible sense of how to manage the relationship with a donor who is handing over their life's work, ensuring the donor entrusts the organisation with the ongoing care of their materials. However, traditionally trained archivists, curators, and librarians typically don't have in-depth technical skillsets, and technologists often haven't witnessed the process of liaising with donors first-hand. Perhaps those working in developer and technical roles, which typically sit further down the workflow for processing born-digital materials, need opportunities to observe the process of acquiring born-digital collections from donors. Might this give them an increased appreciation for the scenarios archivists find themselves in (and must problem-solve their way out of)? Conversely, perhaps archivists, curators, and librarians need to witness developers creating software (especially the effort needed to create a small GUI-based tool for collecting born-digital materials from various Mac operating systems) or debugging code. Is this just a case of swapping seats for a day or a week? Sharing approaches to problem-solving certainly seems key.

Part of what we're doing on the Polonsky Digital Preservation Project is to talk more holistically: rather than 'digital preservation', we're talking about 'digital stewardship', so that the early steps of acquiring born-digital materials aren't overlooked. As the Policy and Planning Fellow at Cambridge University Library, I'm aware I can effect change in a different way. Developing policy – including technical policies (for example, the National Library of New Zealand's Preconditioning Policy, referenced here) – means I can draw on my first-hand experience of acquiring born-digital collections with a greater understanding of what it takes to do this type of work. For now, this is the approach I need to take, and I'm looking forward to the changes I'll be able to influence.


Comments on Somaya's piece would be most welcome. There is plenty of ground for discussion, and constructive feedback will only enhance the wider, collaborative approach to addressing the issue of preserving digital content.