Update on the training programme pilot

Sarah, Oxford’s Outreach and Training Fellow, has been busy since the new year designing and running a digital preservation training programme pilot in Oxford. It consisted of one introductory course on digital preservation and six other workshops. Below is an update on what she did for the pilot and what she has learnt over the past few months.

It’s been a busy few months for me, so I have been quiet on the blog. Most of my time and creative energy has been spent working on this training programme pilot. In total, there were seven courses and over 15 hours of material. In the end, I trialled the courses on 157 people from Bodleian Libraries and the various Oxford college libraries and archives. Many attendees came to several courses; others attended only one.

The trial gave me an opportunity to test out different ideas and various topics. Attendees were good at giving feedback, both during the course and afterwards via an online survey. This has provided me with further ideas and given me the chance to see what works and what doesn’t. I’ve been able to improve the experience each time, but there’s still more work to be done. Along the way, I’ve already learned a lot about digital preservation and teaching.

Below are some of the most important lessons I’ve learned from the training programme pilot.

Time: You always need more

I found that I almost always ran out of time at the end of a course, leaving no time for questions or to finish that last demo. Most of my courses would have benefited from less content, shorter exercises, or just being 30 minutes longer.

Based on feedback from attendees, I’ll be making adjustments to every course. Some will be longer, some will have shorter exercises with more optional components, and some will have slightly less content.

While you might budget 20 minutes for an activity, you will likely use 5–10 minutes more, though it varies every time with the attendees. Some might have a lot of questions; others will be quieter. It’s almost better to overestimate the time and end early than to rush to cover everything. People need a chance to process the information you give them.

Facilitation: You can’t go it alone

In only one of my courses did I have to facilitate alone. I was run off my feet for the 2 hours because it was just me answering questions during exercises for 15 attendees. That doesn’t sound like a lot, but I had a hoarse voice by the end from speaking for almost 2 hours!

Always get help with facilitation—especially for workshops. You need someone to help:

  • answer questions during exercises,
  • get group exercises and conversations started,
  • make extra photocopies or printouts, and
  • load programs and files onto computers—and then help delete them after.

It is possible to run training courses alone, but having an extra person makes things run more smoothly and saves a lot of time. Edith and James have been invaluable support!

Demos: Worth it, but things often go wrong

Demos were vital for illustrating concepts, but they were also sometimes clunky and time-consuming to manage. I wrote up demo sheets to help. The demos relied on software or the Internet—both of which can and will go wrong. Patience is key; so is accepting that sometimes things will not go right. Processes might take a long time to run, or the course might conclude before the demo is over.

The more you practise on the computer you will be using, the more likely things are to go right. But that’s not always an option. If it isn’t, always have a backup plan. Or just apologise, explain what should have happened and move on. Attendees are generally forgiving, and sometimes it can be turned into a really good teaching moment.

Exercises: Optional is the way to go

Unless you put out a questionnaire beforehand, it is incredibly hard to judge the skill level of your attendees. It’s best to prepare for all levels. Start each exercise slowly and have a lot of optional work built in for people who work faster.

In most of my courses I was too ambitious for the time allowed. I wanted attendees to learn and try everything. Sometimes I wasn’t asking the right questions in the exercises either. Testing exercises and timing people is the only way to tailor them. Now that I have run the workshops and seen the exercises in action, I have a clearer picture of what I want people to learn and accomplish—now I just have to make the changes.

Future plans

There were courses I would love to run in the future (like data visualisation and digital forensics) but did not have the time to develop. I’d like to place them on a roadmap for future training, as well as reaching out more to the Oxford colleges, museums and other departments. I would also like to tailor the introductory course a bit more for different audiences.

I’d like to get involved with developing courses like Digital Preservation Carpentry that the University of Melbourne is working on. The hands-on workshops excited and challenged me the most. Not only did others learn a lot, but so did I. I would like to build on that.

At the end of this pilot, I have seven courses that I will finalise and make available under a Creative Commons licence. What I learned when developing these courses is that there aren’t many good templates available on the Internet to use as a starting point—you have to ask around for people willing to share.

So, I am hoping to take the work that I’ve done and share it with the digital preservation community. I hope the courses will be useful resources that can be reused and repurposed. Or at the very least, I hope they can serve as a starting point for inspiration (basic speakers’ notes included).

These will be available via the DPOC website sometime this summer, once I have been able to make the changes necessary to the slides and exercises—along with course guidance material. It has been a rewarding experience (as well as an exhausting one); I look forward to developing and delivering more digital preservation training in the future.

Breaking through with Library Carpentry

Thursday 11th January saw Cambridge University Library’s annual conference take place. This year it was entitled ‘Breakthrough the Library’ and focused on cutting-edge innovation in libraries and archives. I can honestly say that this was the first conference I’ve ever been to where every single speaker I saw (including the ten or so who gave lightning talks) was absolutely excellent.

So it’s hard to pick the one that made the biggest impression. Of course, an honourable mention must go to the talk about Jasper the three-legged cat, but if I had to plump for the one most pertinent to moving digital preservation forward, I’d pick “Library Carpentry: software and data skills for librarian professionals” from Dr James Baker of the University of Sussex.

I’d heard of the term ‘Library Carpentry’ (and the initiatives it stems from – Software Carpentry and Data Carpentry) and thus had an idea what the talk was about on the way in. Their web presence explains things far better than I can, too (see https://librarycarpentry.github.io/), so I’m going to skip the exposition and make a different point…

As a full-blown, time-served nerd who’s clearly been embittered by 20 years in the IT profession (though I’m pleased to report, not as much as most of my long-term friends and colleagues!), I went into the talk with a bit of a pessimistic outlook. This was because, in my experience, there are three stages one passes through when learning IT skills:

  • Stage 1: I know nothing. This computer is a bit weird and confuses me.
  • Stage 2: I know EVERYTHING. I can make this computer sing and dance, and now I have the power to conquer the world.
  • Stage 3: … er – hang on… The computer might not have been doing exactly what I thought it was, after all… Ooops! What did I just do?

Stage 1 is just something you get through (if you want to – I have nothing but respect for happy Stage 1 dwellers, though). If so inclined, all it really takes is a bit of persistence and a dollop of enthusiasm. If you want to get through but think you might struggle, have a go at this computer programming aptitude test from the University of Kent – you may be pleasantly surprised… In my own case, I got stuck there for quite a while until one day a whole pile of O Level algebra that was lurking in my brain suddenly rose out of the murk, and that was that.

Stage 2 people, on the other hand, tend to be really dangerous… I have personally worked with quite a few well-paid developers who are stuck in Stage 2, and they tend to be the ones who drop all the bombs on your system. So the faster you can get through to Stage 3, the better. This was at the root of my concern, as one of the ideas of Library Carpentry is to pick up skills quickly, and then pass them on. But I needn’t have worried because…

When I asked Dr Baker about this issue, he reassured me that ‘questioning whether the computer has done what you expected’ is a core learning point that is central to Library Carpentry, too. He also declared the following (which I’m going to steal): “I make a point of only ever working with people with Impostor Syndrome”.

Hence it really does look as if getting to Stage 3 without going through Stage 2 at all is what Library Carpentry is all about. I believe moves are afoot to get some of this good stuff going at Cambridge… I watch with interest and might even find the time to join in..? I bet it’ll be fun.

Towards a common understanding?

Cambridge Outreach and Training Fellow, Lee, describes the rationale behind trialling a recent workshop on archival science for developers, as well as reflecting on the workshop itself. Its aim was to get all those working in digital preservation within the organisation to have a better understanding of each other’s work, improving co-operation for a sustainable digital preservation effort.

Quite often, there is a perceived language barrier due to the wide range of practitioners that work in digital preservation. We may be using the same words, but there’s not always a shared common understanding of what they mean. This became clear when I was sitting next to my colleague, a systems integration manager, at an Archivematica workshop in September. Whilst not a member of the core Cambridge DPOC team, our colleague is a key part of our extended digital preservation network at Cambridge University Library and plays a key role in developing, understanding and retaining digital preservation knowledge in the institution.

For those from a recordkeeping background, the design principles behind the front end of Archivematica should be obvious, as it incorporates both traditional principles of archival practice and features of the OAIS model. However, coming from a systems integration point of view, my colleague needed me to translate words such as ‘accession’, ‘appraisal’ and ‘arrangement’, whose meanings many of us with an archival education take for granted.

I asked my colleague if an introductory workshop on archival science would be useful, and she said, “yes, please!” Thus, the workshop was born. Last week, a two-and-a-half-hour workshop was trialled for our developer and systems integration colleagues. The aim of the workshop was to enable them to understand what archivists are taught on postgraduate courses and how this teaching informs their practice. After gauging the attendees’ impressions of an archivist and the things that they do (see image), the workshop practically explored how an archivist would acquire and describe a collection. The workshop was based on an imaginary company, complete with a history, a description of the business units and examples of potential records they would deposit. There were practical exercises on making an accession record, appraising a collection, artificial arrangement and subsequent description using ISAD(G).


Sticky notes about archivists from a developer point of view.

Having seen how an archivist would approach a collection, the workshop moved on to explaining physical storage and preservation before turning to digital preservation, looking specifically at OAIS and then at examples of digital preservation software systems. One exercise asked attendees to use what they had learned in the workshop to see where archival ideas mapped onto the systems.

The workshop tried to demonstrate how archivists have approached digital preservation armed with the professional skills and knowledge that they have. The idea was to inform teams working with archivists on digital preservation of how archivists think, and how and why some of the tools and products are designed the way they are. My hope was for ‘IT’ to understand the depth of knowledge that archivists have, in order to help everyone work together on a collaborative digital preservation solution.

Feedback was positive and the workshop will be run again in the New Year. Similarly, I’m hoping to devise a course from a developer perspective that will help archivists communicate more effectively with developers. Ultimately, both will be working from a better understanding of each other’s professional skill sets. Co-operation and collaboration on digital preservation projects will become much easier across disciplines, and we’ll have a better-informed (and more relaxed) environment in which to share practices and thoughts.

PASIG 2017: honest reflections from a trainee digital archivist

A guest blog post by Kelly, one of the Bodleian Libraries’ graduate digital archivist trainees, on what she learned as a volunteer and attendee of PASIG 2017 Oxford.

Amongst digital preservation professionals from almost every continent and 130 institutions, my five traineeship colleagues and I took our places in the lecture theatre seats, the annexe demos and the awesome artefacts at the Museum of Natural History for PASIG 2017, Oxford. Just six months into our traineeship, it was a brilliant opportunity not only to apply some of our new knowledge to work at Special Collections, Bodleian Libraries, but also to gain a really current and relevant insight into theories we have been studying as part of our distance-learning MSc in Digital Curation at Aberystwyth University. The first ‘Bootcamp’ day was exactly what I needed to throw myself in, and it really consolidated my confidence in my understanding of some aspects of the shared language used amongst the profession (fixity checks, maturity models… as well as getting to grips with submission information packages, dissemination information packages and everything that occurs in between!).

My pen didn’t stop scribbling for all three days, except maybe for tea breaks. That said, the demo presentations were also a great time for me and the other trainees to ask questions specifically about the workflows and benefits of certain software such as LibNova, Preservica and ResourceSpace.

For want of a better word (and because it really is the truth), PASIG 2017 was genuinely inspiring, and some messages were delivered so powerfully that I hope to stay grounded in them for my entire career. Here is what I was taught:

The community is invaluable. Many of the speakers were quick to assert that sharing practice amongst the digital preservation community is key. This is a value I was familiar with, yet it was something else to witness it happening throughout the conference in such a sincere manner. I can assure you the gratitude and affirmation that followed Eduardo del Valle, University of the Balearic Islands, and his presentation “Sharing my loss to protect your data: A story of unexpected data loss and how to do real preservation” was as encouraging for someone new to the profession to witness as it was for all the experienced delegates present. As well as sharing practice, it was clear that the community needs to advocate on behalf of each other. It is time- and resource-consuming, but oh-so important.

Digital archives are preserving historical truths. Yes, the majority of the workflow is technological, but the objectives and functions are about much more than technology; to reduce digital preservation to the technology alone is an oversimplification. It was clear that the range of use cases presented at PASIG were all driven towards documenting (and preserving) social, political and historical information that will be of absolute necessity to society and infrastructure in future. Right now, for example, Angeline Takewara and her colleagues at the UN MICT are working on a digital preservation programme to ensure absolute accountability and usability of the records of the International Criminal Tribunals for both Rwanda and Yugoslavia. I have written a more specific post on Angeline’s presentation here.

Due to the nature of technology and the digital world, the goalposts will always be moving. For example, Somaya Langley’s talk on the future of digital preservation and the mysteries of extracting data from smart devices will soon become (and maybe already is) a reality for those working with accessions of archives or information management. We should, then, embrace change and embrace the unsure, and ultimately ‘get over the need for tidiness’, as pointed out by John Sheridan from The National Archives during his presentation “Creating and sustaining a disruptive digital archive”. This is usually counter-intuitive, but as the saying goes, one of the most dangerous phrases to use is ‘we’ve always done it that way’.

The value of digital material outlives the software, so enabling the prolonged use of software is a real and current issue. Admittedly, this was a factor I had genuinely not even considered before. In my brain I had linked obsolescence with hardware and hardware only. Dr. Natasa Milic-Frayling’s presentation on “Aging of Digital: Managed Services for digital continuity” therefore shed much light on the changing computing ecosystem and the gradual aging of software. What I found especially interesting about the proposed software-continuity plan was its transparency: the client can ask to see the software at any time whilst it is being stabilised and maintained.

Thank you so much PASIG 2017 and everybody involved!

One last thing… in closing, Cliff Lynch, CNI, brought up that there was comparably less web archiving content this year. If anybody fancies taking a trainee to Mexico next year to do a (lightning) talk on Bodleian Libraries’ Web Archive, I am keen…



Visit to the Parliamentary Archives: Training and business cases

Edith Halvarsson, Policy and Planning Fellow at Bodleian Libraries, writes about the DPOC project’s recent visit to the Parliamentary Archives.

This week the DPOC fellows visited the Parliamentary Archives in London. Thank you very much to Catherine Hardman (Head of Preservation and Access), Chris Fryer (Digital Archivist) and Grace Bell (Digital Preservation Trainee) for having us. Shamefully I have to admit that we have been very slow to make this trip; Chris first invited us to visit all the way back in September last year! However, our tardiness to make our way to Westminster was in the end aptly timed with the completion of year one of the DPOC project and planning for year 2.

Like CUL and Bodleian Libraries, the Parliamentary Archives also first began their own digital preservation project back in 2010. Their project has since transitioned into digital preservation in a more programmatic capacity as of 2015. As CUL and Bodleian Libraries will begin drafting business cases for moving from project to programme in year 2, meeting with Chris and Catherine was a good opportunity to talk about how to start making that tricky transition.

Of course, every institution has its own drivers and risks which influence business cases for digital preservation, but certain things will sound familiar to a lot of organisations. For example, what the Parliamentary Archives have found over the past seven years is that advocacy for digital collections and training staff in digital preservation skills are ongoing activities. Implementing solutions is one thing; maintaining them is another. This, in addition to staff who have received digital preservation training eventually moving on to new institutions, means that you constantly need to stay on top of advocacy and training. Making “the business case” is therefore not a one-off task.

Another central challenge in building business cases is how you frame digital preservation as a service rather than as “an added burden”. The idea of “seamless preservation” with no human intervention is very appealing to already burdened staff, but in reality workflows need to be supervised and maintained. To sell digital preservation, that extra work must be perceived as something which adds value to collection material and the organisation. It is clear that physical preservation adds value to collections, but the argument for digital preservation can be a harder sell.

Catherine had, however, some encouraging comments on how we can turn advice about digital preservation into something which is perceived as adding value. Being involved with and talking to staff early on in the design of new project proposals – rather than as an extra add-on after processes are already in place – is one example of this.

Image by James Mooney

All in all, it was a valuable and encouraging visit to the Parliamentary Archives. The DPOC fellows look forward to keeping in touch – particularly to hear more about the great work the Parliamentary Archives have been doing to provide digital preservation training to staff!

Transcribing interviews

The second instalment of Lee’s experience running a skills audit at Cambridge University Library. He explains what is needed to transcribe the lengthy and informative interviews with staff.

There’s no ground-breaking digital preservation goodness contained within this post, so you have permission to leave this page now. However, this groundwork is crucial to understanding how institutions can prepare for digital preservation skills and knowledge development. It may also be useful to anyone who is preparing to transcribe recorded interviews.

Post-interview: transcribing the recording

Once you have interviewed your candidates and made sure that you have all the recordings (suitably backed up three times onto private, network-free storage such as an encrypted USB stick, so as to respect privacy wishes), it is time to transcribe.
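Because the recordings hold confidential research data, it is also worth recording a checksum for each file when you make those three copies, so you can confirm later that none of them has silently changed. A minimal sketch using only Python’s standard library (the function names and paths are my own illustration, not part of any DPOC tooling):

```python
import hashlib
from pathlib import Path


def sha256(path: Path) -> str:
    """Checksum a file in 1 MB chunks so large recordings fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(original: Path, copies: list[Path]) -> bool:
    """Return True only if every backup copy matches the original's checksum."""
    expected = sha256(original)
    return all(sha256(copy) == expected for copy in copies)
```

Running the check whenever you move files between storage devices tells you immediately which copy to discard before you start transcribing from it.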

So, what do you need?

  • A very quiet room. Preferably silence, where there are no distractions and where you can’t distract people. You may wish to choose the dictation path, and if you do that in an open-plan office, you may attract attention. You will also be reciting information that you have assured participants will remain confidential.
  • Audio equipment. You will need a device that can play your audio files and has playback controls. You can use your device’s speakers, headphones (preferably with controls built into the wire) or a foot pedal.
  • Time. Bucket loads of it. If you are doing other work, this needs to become the big rock in your time planning; everything else should be mere pebbles and sand. This is where manager support is really helpful, as is…
  • Understanding. The understanding that this will rule your working life for the next month or two, and the understanding of those around you of the size of the task. Having an advocate with experience of this type of work is invaluable.
  • Patience. Of a saint.
  • Simple transcription rules. Given the timeframes of the project, complex transcription would have been too time-consuming. The following work, as used by the University of California, San Diego, is really useful (with nice big text):
    Dresing, Thorsten / Pehl, Thorsten / Schmieder, Christian (2015): Manual (on) Transcription: Transcription Conventions, Software Guides and Practical Hints for Qualitative Researchers. 3rd English edition. Marburg. Available online: http://www.audiotranskription.de/english/transcription-practicalguide.htm (last accessed: 27.06.2017). ISBN: 978-3-8185-0497-7.

Cropped view of a person’s hands typing on a laptop. Image credit: Designed by Freepik

What did you do?

Using a Mac, I imported the audio files for transcription into a desktop folder and created a playlist in iTunes. I reduced the iTunes application to the mini-player view and opened up Word to type into. I plugged in my headphones, pressed play and typed as I listened.

If you get tired of typing, Word on the Mac has a nifty voice-recognition package. It’s uncannily good now. I tried routing the output sound into the microphone using Soundflower, but that was wasted time: when the transcription did yield readable text, it used words worthy of inciting a Mary Whitehouse campaign. Dictating the recordings myself, however, did provide a rest for weary fingers. After a while, you will probably need to rest a weary voice, so you can switch back to typing.

When subjects started talking quickly, I needed a way to slow them down, as constantly pressing pause and rewind got onerous. A quick fix was to download Audacity, which can slow down your sound files. Once the comedic effect of the voice alteration has worn off, transcription becomes easier as you don’t have to pause and rewind as much.
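Audacity works well interactively; if you ever need to slow a whole folder of recordings in one go, something similar can be sketched with Python’s standard-library wave module. This is only a rough alternative (the filenames are hypothetical), and unlike Audacity’s ‘Change Tempo’ effect it lowers the pitch as well as the speed:

```python
import wave


def slow_down(src: str, dst: str, factor: float = 0.75) -> None:
    """Write a copy of a WAV file that plays back at `factor` speed.

    Lowering the declared frame rate slows playback, but it also lowers
    the pitch (unlike Audacity's tempo-only 'Change Tempo' effect).
    """
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames = reader.readframes(params.nframes)
    with wave.open(dst, "wb") as writer:
        # Same audio data, declared at a slower frame rate.
        writer.setparams(params._replace(framerate=int(params.framerate * factor)))
        writer.writeframes(frames)
```

For example, `slow_down("interview01.wav", "interview01_slow.wav", 0.75)` would give a copy at three-quarters speed.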

Process-wise, it doesn’t sound like much, and it isn’t. It’s just the sheer hours of audio that need to be made legible through listening, rewinding and typing.

How can the process be made (slightly) easier?

  • Investigate transcription technology and processes. Investigate the technologies available to you beforehand. I wish I had done this rather than relying on the expectation that I would just be listening and typing. I didn’t find a single website with the answer, but a thoughtful web search can help you with certain parts of the transcription method.
  • Talk slowly. This one applies not to the transcription process but to the interview process. Try to ask the questions a little more slowly than you usually would, as the respondent will subconsciously mimic your speed of delivery and slow down too.

Hang on in there, it’s worth it

Even if you incorporate the suggestions above, be under absolutely no illusions: transcription is a gruelling task. That’s not a slight against the participants’ responses, for they will be genuinely interesting and insightful. No, it’s a comment on the frustration of the process and the sheer mental grind of getting through it. I must admit I had only arrived at a reasonably happy transcription method by the time I reached interview fourteen (of fifteen). However, the effort is completely worth it. I now have around 65,000 quality words (research data) to analyse, to understand what existing digital skills, knowledge, ways of learning and approaches to managing change exist within my institution that can be fed into the development of digital preservation skills and knowledge.

Skills interviewing using the DPOC skills interview toolkit

Cambridge Outreach & Training Fellow, Lee, shares his experiences in skills auditing.

As I am nearing the end of my fourteenth transcription and am three months into the skills interview process, now is a good time to pause and reflect. This post will look at the experience of the interview process using the DPOC digital preservation skills toolkit. This toolkit is currently under development; we are learning and improving it as we trial it at Cambridge and Oxford.

Step 1: Identify your potential participants

To understand colleagues’ use of technology and their training needs, a series of interviews was arranged. We agreed that a maximum sample of 25 participants would give us plenty (perhaps too much?) of material to work with. Before invitations were sent out, a list of potential participants was drawn up. In building the list, a set of criteria ensured that a broad range of colleagues was captured. The criteria consisted of:

  • in what department or library do they work?
  • is there a particular bias of colleagues from a certain department or library and can this be redressed?
  • what do they do?
  • is there a suitable practitioner to manager ratio?

The criteria rely on you having a good grasp of your institution, its organisation and the people within it. If you are unsure, start asking managers and colleagues who do know your institution very well—you will learn a lot! It is also worth having a longer list than your intended maximum, in case you do not get responses or people are unavailable or do not wish to participate.

Step 2: Inviting your potential participants

Prior to sending out invitations, the intended participants’ managers were consulted to see if they would agree to their staff’s time being used in this way. This was also a good opportunity to continue raising awareness of the project, as well as to get buy-in to the interview process.

The interviews were arranged in blocks of five to make planning around other work easier.

Step 3: Interviewing

The DPOC semi-structured skills interview questions were put to the test at this step. Having developed the questions beforehand ensured I covered the necessary digital preservation skills during the interview.

Here are some tips I gained from the interview process which helped to get some great responses.

  • Offer refreshments before the interview. Advise beforehand that a generous box of chocolate biscuits will be available throughout proceedings. This also gives you an excellent chance to talk informally to your subject and put them at ease, especially if they appear nervous.
  • If using recording equipment, make sure it is working. There’s nothing worse than thinking you have fifty minutes of interview gold only to find that you’ve not pressed record or the device has run out of power. Take a second device, or if you don’t want the technological hassle, use pen(cil) and paper.
  • Start with colleagues that you know quite well. This will help you understand the flow of the questions better and they will not shy away from honest feedback.
  • Always have printed copies of interview questions. Technology almost always fails you.

My next post will be about transcribing and analysing interviews.

Outreach and Training Fellows visit CoSector, University of London

Outreach & Training Fellow, Lee, chronicles his visit with Sarah to meet CoSector’s Steph Taylor and Ed Pinsent.

On Wednesday 29 March, a date forever to be associated with the UK’s triggering of Article 50, Sarah and Lee met with CoSector’s Stephanie Taylor and Ed Pinsent in the spirit of co-operation. For those who don’t know, Steph and Ed are behind the award-winning Digital Preservation Training Programme.

Russell Square was overcast, but it was great to see that London was still business as usual, with its hallmark traffic congestion and busloads of sightseers lapping up the cultural hotspots. Revisiting the University of London’s Senate House is always a visual pleasure, and it’s easy to see why it was home to the Ministry of Information: the building screams order and neat filing.


Senate House, University of London. Image credit: By stevecadman – http://www.flickr.com/photos/stevecadman/56350347/, CC BY-SA 2.0, https://commons.wikimedia.org/w/index.php?curid=6400009

We were keen to speak to Steph and Ed to tell them more about the DPOC project to date and where we were with training developments. Similarly, we were keen to learn about the latest developments in CoSector’s training plans, and we were interested to hear that CoSector will be developing their courses into more specialist areas of digital preservation, so watch this space… (well, at least the CoSector space).

It was a useful meeting because it gave us the opportunity to get instant feedback on the way the project is working and where we could help feed into current training and development needs. In particular, they were really interested to learn about the relationship between the project team and IT. Sarah and I feel that having access to two technical IT experts who are on board and happy to answer our questions, however simple they may be from an IT point of view, makes it easier to understand IT issues. Similarly, we find that we have better conversations with our colleagues who are Developers and Operations IT specialists because we have a linguistic IT bridge with our technical colleagues.

It was a good learning opportunity and we hope to build upon this first meeting in the future as part of a sustainable training solution.

Training begins: personal digital archiving

Outreach & Training Fellow, Sarah, has officially begun training and capacity building with a session on personal digital archiving at the Bodleian Libraries. Below, Sarah shares how the first session went, along with some personal digital archiving tips.

It was early Tuesday morning and the Weston Library had just opened to readers. I got to town earlier than usual, stopping to get a Melbourne-style flat white at one of my favourite local cafes to get me in the mood for public speaking. By 9am I was in the empty lecture theatre, fussing over cords, adjusting lighting and panicking when I struggled to log in to the laptop.

At 10am, twenty-one interested faces were seated with pens at the ready; there was nothing else to do but take a deep breath and begin.

In the 1.5 hour session, I covered the DPOC project, digital preservation and personal digital archiving. The main section of the training covered the personal digital archiving and preservation lifecycle and the best practice steps to follow to save your digital stuff!

The steps of the Personal Digital Archiving & Preservation Lifecycle are intended to help with keeping your digital files organised, findable and accessible over time. It’s not prescriptive advice, but it is a good starting point for better habits in your personal and work lives. Below are tips for every stage of the lifecycle that will help build better habits and preserve your valuable digital files.

Keep Track and Manage:

  • Know where your digital files are and what digital files you have: make a list of all of the places you keep your digital files
  • Find out what is on your storage media – check the label, read the file and folder names, open the file to see the content
  • Most importantly: delete or dispose of things you no longer need.
    • This includes: things with no value, duplicates, blurry images, previous document versions (if not important) and so on.


  • Use best practice for file naming:
    • No spaces, use underscores _ and hyphens – instead
    • Put ‘Created Date’ in the file name using yyyymmdd format
    • Don’t use special characters <>,./:;'”\|[]()!@£$%^&*€#`~
    • Keep the name concise and descriptive
    • Use a version control system for drafts (e.g. yyyymmdd_documentname_v1.txt)
  • Use best practice for folder naming:
    • Concise and descriptive names
    • Use dates where possible (yyyy or yyyymmdd)
    • Keep file paths short and avoid a deep hierarchy
    • Choose structures that are logical to you and to others
  • To rename large groups of image files, consider using batch rename software
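To show how the naming conventions above fit together, here is a small Python sketch. The function name and its exact rules are my own illustration of the advice, not official guidance:

```python
import re
from datetime import date

def archive_filename(title, created=None, version=1, ext="txt"):
    """Build a file name following the conventions above:
    a yyyymmdd date, underscores instead of spaces, no special
    characters, and a version suffix for drafts."""
    created = created or date.today()
    stamp = created.strftime("%Y%m%d")
    # Replace runs of whitespace with underscores
    name = re.sub(r"\s+", "_", title.strip())
    # Drop everything except letters, digits, underscores and hyphens
    name = re.sub(r"[^A-Za-z0-9_-]", "", name)
    return f"{stamp}_{name}_v{version}.{ext}"

# Example: a first draft written on 1 May 2017
print(archive_filename("Training Plan: Draft!", date(2017, 5, 1)))
# 20170501_Training_Plan_Draft_v1.txt
```

The same idea scales up: a loop over a folder of files gives you a simple batch renamer along the lines suggested above.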


  • Add important metadata directly into the body of a text document
    • creation date & version dates
    • author(s)
    • title
    • access rights & version
    • a description about the purpose or context of the document
  • Create a README.txt file of metadata for document collections
    • Be sure to list the folder names and file names to preserve the link between the metadata and the text file
    • Include information about the context of the collection, dates, subjects and other relevant information
    • This is also a quick method for creating metadata around digital image collections
  • Embed the metadata directly in the file
    • For image and video: be sure to add subjects, location and a description of the trip or event
  • Add tags to documents and images to aid discoverability
  • Consider saving the ‘Creation Date’ in the file name, a free text field in the metadata, in the document header or in a README text file if it is important to you. In some cases transferring the file (copying to new media, uploading to cloud storage) will change the creation date and the original date will be lost. The same goes for saving as a different file type. Always test before transfer or ‘Save As’ actions or record the ‘Creation Date’ elsewhere.
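As a rough sketch of the README.txt idea above, a short script can list a folder’s files alongside some basic collection metadata, keeping the link between the metadata and the files it describes. The function name and the fields chosen are illustrative, not a standard:

```python
from datetime import date
from pathlib import Path

def write_readme(folder, description, author):
    """Create a README.txt in the folder, listing its files so
    the metadata stays linked to the collection it describes."""
    folder = Path(folder)
    lines = [
        f"Collection: {folder.name}",
        f"Description: {description}",
        f"Author: {author}",
        f"Metadata created: {date.today():%Y%m%d}",
        "Files:",
    ]
    # List every file in the collection (except the README itself)
    for path in sorted(folder.rglob("*")):
        if path.is_file() and path.name != "README.txt":
            lines.append(f"  {path.relative_to(folder)}")
    (folder / "README.txt").write_text("\n".join(lines) + "\n")
```

Because the output is plain text, it stays readable no matter what happens to the software around it.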


  • Keep two extra backups in two geographically different locations
  • Diversify your backup storage media to protect against potential hardware faults
  • Try to save files in formats better suited to long-term access (for advice on how to choose file formats, visit Stanford University Libraries)
  • Refresh your storage media every three to five years to protect against hardware failure
  • Do annual spot checks, including checking all backups. This will help catch any loss, corruption or damaged backups. Also consider checking all of the different file types in your collection to ensure they are still accessible, especially if they are not saved in a recommended long-term file format
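Spot checks can be partly automated with checksums: compute a fingerprint of each original file, compare it against the backup copy, and flag anything missing or changed. The sketch below is my own illustration of the idea, using SHA-256:

```python
import hashlib
from pathlib import Path

def checksum(path):
    """Return the SHA-256 digest of a file, read in chunks so
    large files don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def spot_check(original_dir, backup_dir):
    """Compare each file in the original against its backup copy
    and report anything missing or changed."""
    problems = []
    for path in Path(original_dir).rglob("*"):
        if not path.is_file():
            continue
        copy = Path(backup_dir) / path.relative_to(original_dir)
        if not copy.exists():
            problems.append(f"missing from backup: {copy}")
        elif checksum(path) != checksum(copy):
            problems.append(f"checksum mismatch: {copy}")
    return problems
```

Run once a year against each backup location, an empty list means your copies still match; anything else tells you exactly which file to restore.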

Even I can admit I need better personal archiving habits. How many photographs are still on my SD cards, waiting for transfer, selection/deletion and renaming before saving in a few choice safe backup locations? The answer is: too many. 

Perhaps now that my first training session is over, I should start planning my personal side projects. I suspect clearing my backlog of SD cards is one of them.

Useful resources on personal digital archiving:

DPC Technology Watch Report, “Personal digital archiving” by Gabriela Redwine

DPC Case Note, “Personal digital preservation: Photographs and video”, by Richard Wright

Library of Congress “Personal Archiving” website, which includes guidance on preserving specific digital formats, videos and more


IDCC 2017 – data champions among us

Outreach and Training Fellow, Sarah, provides some insight into some of the themes from the recent IDCC conference in Edinburgh on 21–22 February. The DPOC team also presented their first poster, “Parallel Auditing of the University of Cambridge and the University of Oxford’s Institutional Repositories,” which is available on the ‘Resource’ page.

Storm Doris waited to hit until after the main International Digital Curation Conference (IDCC) had ended, allowing for two days of great speakers. The conference focused on research data management (RDM) and sharing data. In Kevin Ashley’s wrap-up, he touched on data champions and the possibilities of data sharing as two of the many emerging themes from IDCC.

Getting researchers to commit to good data practice and then publish data for reuse is not easy. Many talks focused around training and engagement of researchers to improve their data management practice. Marta Teperek and Rosie Higman from Cambridge University Library (CUL) gave excellent talks on engaging their research community in RDM. Teperek found value in going to the community in a bottom-up, research led approach. It was time-intensive, but allowed the RDM team at CUL to understand the problems Cambridge researchers faced and address them. A top-down, policy driven approach was also used, but it has been a combination of the two that has been the most effective for CUL.

Higman went on to speak about the data champions initiative. Data champions were recruited from students, post-doctoral researchers, administrators and lecturers. What they had in common was their willingness to advocate for good RDM practices. Each of the 41 data champions was responsible for at least one training session a year. While the data champions did not always do what the team expected, their advocacy for good RDM practice has been invaluable. Researchers need strong advocates to see the value in publishing their data – it is not just about complying with policy.

On day two, I heard from researcher and data champion Dr. Niamh Moore from the University of Edinburgh. Moore finds that many researchers either think archiving their data is a waste of time or are concerned about the future use of their data. As a data champion, she believes that research data is worth sharing and thinks other researchers should be asking, ‘how can I make my data flourish?’. Moore uses Omeka to share the research data from Clayoquot Lives, her mid-90s project at the Clayoquot Sound peace camp. For Moore, the benefits of sharing research data include:

  • using it as a teaching resource for undergraduates (getting them to play with data, which many do not have a chance to do);
  • public engagement impact (for Moore it was an opportunity to engage with the people previously interviewed at Clayoquot); and
  • new articles: creating new relationships and new research where she can reuse her own data in new ways or other academics can as well.

Opening up data and archiving leads to new possibilities. The closing keynote on day one discussed the possibilities of using data to improve the visitor experience at the British Museum. Data Scientist, Alice Daish, spoke of data as the unloved superhero. It can rescue organisations from questions and problems by providing answers, helping organisations make decisions, take action and even find new questions. For example, Daish has been able to wrangle and utilise data at the British Museum to learn about the most popular collection items on display (the Rosetta Stone came first!).

And Daish, like Teperek and Higman, touched on outreach as the only way to advocate for data – creating good data, sharing it, and using it to its fullest potential. And for the DPOC team, we welcome this advocacy; and we’d like to add to it and see that steps are also made to preserve this data.

Also, it was great to talk about the work we have been doing and the next steps for the project. Thanks to everyone who stopped by our poster!

Oxford Fellows (From left: Sarah, Edith, James) holding the DPOC poster out front of the appropriately named “Fellows Entrance” at the Royal College of Surgeons.