Archive this! UK Web Archive Mini Conference – November 4, 2019

I recently attended this event at the British Library (BL) in London, in which the small team of BL employees who work on the UK Web Archive (UKWA) spoke about their work to staff from the UK’s other legal deposit libraries. The UKWA is a legal deposit programme and much of the content it archives is only accessible from legal deposit library reading rooms. The purpose of the mini-conference was to better inform reading room staff at these institutions – which include my employer, the Bodleian Libraries – so that they can guide their readers to it as a resource for research.

The BL promoted the event with the following messages:

The UKWA gives researchers access to a new and largely untapped resource from which to conduct research. It will be demonstrated that the UKWA provides a unique window into all aspects of our lives, whether they be cultural, economic, political or social. The UKWA can also show the changing ways in which information is disseminated, and how websites have transformed over time by providing access to this important archive in perpetuity. Staff will learn how to contribute to the UKWA themselves to ensure websites are captured for posterity as well as be able to support researchers who are interested in adding to the UKWA.

The day was structured into presentations from guest researchers and BL staff working on UKWA, including its web archive engagement manager and curators. Professor Jane Winters (@jfwinters), professor of digital humanities at the School of Advanced Study, University of London, opened the conference with a welcome talk emphasising how much historians will need archives of digital information if we are not to move into a new dark age as relatively long-lived printed documents give way to more ephemeral digital ones. This gives a stark impression of the scale of the task.

Jane was followed by a brief introduction to the UKWA and its web interface by Jason Webber (@jasonmarkwebber) of the BL, whose job it is to promote the BL’s web archiving programme. There are two ways of accessing archived data: provides five datasets that are optimized for quantitative analysis; for a qualitative approach, it’s possible to search and browse the archive at

Amongst Jason’s key messages are that they can only collect a “representative sample of the UK web space,” and that only major news websites are collected daily, with all other sites archived considerably less frequently. The ‘Topics and Themes’ displayed on the UKWA homepage are the easiest way into focused curated collections of material on specific topics. General searching is possible but, rather like Google, is likely to return far too many results for anyone to be able to sift through manually.

Next, Nicola Bingham (@NicolaJBingham) and Helena Byrne (@HBee2015), both web archive curators at the BL, talked about the wider team members working on the UKWA and the tasks they carry out to make it all work. The various legal deposit libraries curate collections on particular topics, but this is balanced against a generally undiscriminating approach to archiving public sites, meaning that they are not filtered for quality. Anyone can nominate any UK website to be added from a link on the UKWA homepage. It doesn’t have to have a UK domain or be hosted in the UK as long as it is demonstrably UK content.

During the lunch break there was an opportunity to submit a webpage for inclusion using the ‘nominate’ form.

Jason then introduced two PhD students who are making use of the UKWA as an information source in their research. Liam Markey (@Liam_Markey94) is based at the University of Liverpool and is researching the concept of militarism, particularly with respect to how the First World War is commemorated. He has found the UKWA to be a useful source of informally written and published material that nevertheless gives insights into how particular words are used in association with the remembrance of war. Following Liam, Hannah Connell (@HannahfConnell), of King’s College London, described how the UKWA has helped in her study of émigré publishing in Russian in the 20C. Of course, blogging, or otherwise writing on the web, is a form of publishing and, therefore, researchers like Hannah would be missing relevant material if they did not search the web, including looking for archived content that is no longer available.

After these case studies, Jason took the floor again to discuss the challenges and opportunities of web archiving. Are we really catching a representative sample? Personal sites such as WordPress blogs are underrepresented in the archive. And much online content is moving away from the open website format, eg, social media accounts behind logins and the general movement towards apps on mobile devices rather than browsers. These are all beyond reach. Even where content is captured, opportunities for ‘big data’ analysis are limited both by staffing levels and legal deposit restrictions (eg, access on a legal deposit library site only). Researchers wanting to cite archived web content also hit obstacles thanks to these legal deposit restrictions (ie, they cannot provide a valid link to content that is only accessible in a handful of reading rooms). Overall, very few people are working on web archiving globally, and fewer still have technical skills.

“We’re not quite history yet,” Webber states, referring to the still relatively recent date of the earliest archived web content. As Professor Winters suggested in her introduction, interest in and use of web archiving is expected to rise. “There is a great future ahead of us.”

To wrap up the day, the BL’s Dr Richard Price thanked all contributors and put the importance of the project into context for society as a whole: “Public witness is fundamental.” We will always need to know who said what and when in the public sphere.

To keep up-to-date follow the @UKWebArchive blog here:

Website is coming

HTML screenshot

…just a bit of formatting and a couple of hundred footnote links to sort out. However, all of the text has been exported from InDesign, the wireframing done, and the essential HTML and CSS worked out (thanks

I’m using Notepad++ on Windows and Dreamweaver on macOS to write and edit the .html text files but all of the coding is by hand (OK, there’s some copying and pasting).

And best of all, it’s very satisfying work and actually a lot of fun!

Watch this space…

New roles, new skills

It’s been a good while since I posted anything here and the reason is, I’ve been busy. After six months working in a new role I’ve now passed my probation and am all set to work out the remaining 18 months of my contract. I’m still at Oxford but, rather than being at a College, am based in the Bodleian’s library service at the John Radcliffe hospital. Bodleian Health Care Libraries (BHCL) is a group of four libraries, one at each of the Oxford University Hospitals NHS Foundation Trust’s sites. The Cairns Library at the JR is the largest and serves all Trust employees, the Medical Sciences Division (MSD) of the University, and Oxford Brookes students on health care courses. I am employed as a grade 4 senior library assistant with particular responsibility for enquiries and outreach.

Before I list some of the things I’m doing it’s satisfying to be able to make a connection between my last job and this one. The Cairns Library is, of course, named after Sir Hugh Cairns – the pioneering neurosurgeon who ran the wartime head hospital on the site of St Hugh’s College, my previous employer, after having helped found the University of Oxford Medical School.

Whilst my role at St Hugh’s was very book-oriented (and what a beautiful and rich collection it is), much of my time is now focused on the discovery of online resources; in particular, medical journal articles. BHCL has a team of outreach librarians, each attached to departments across the Trust and MSD, thereby specialising in those particular fields of practice and research. My job is to help them by fielding initial enquiries and either handling them myself or passing them on to my colleagues. For example, I can give a reader advice on how to go about a literature search but my colleagues will conduct a full reference interview and carry out a search on their behalf. Many practitioners take part in some form of continuing professional development and there is a steady demand on our services from a variety of people – doctors, nurses and researchers. To this end we also provide a range of information skills sessions, which I also support.

While all of this is going on I am also part of a team running a library service. I do a fair number of hours on our reception desk (mostly registrations) and help desk (general enquiries, how do I do XYZ?). This is a role that is widely shared throughout the team, from grade 2 to grade 10, which has the very positive effect of giving us a shared and relatable work experience, no matter our seniority. Back at my own desk I spend a lot of time on creative marketing tasks. I’m part of two different teams: one redeveloping some of our webpages, with everything that entails, including information architecture, some basic coding and plenty of CMS work, LibGuide and image editing; and another tasked with developing new ways to promote our services, particularly throughout the hospital. I still do plenty of what I’ve been doing for years: leaflets, posters and presentations.

I’m very happy to be finally poking around with HTML after years of working in marketing but never having had to tackle it directly. I’m also continuing the social media side of things that I had been responsible for at St Hugh’s by taking on the role of web and social media officer for CILIP in the Thames Valley (a group I first mentioned way back in my second post on this blog). All of this web stuff is priming me for that website of my dissertation I posted about a year ago. It is still a work in progress. Honest! In the meantime it can be found here.

Mentor exchange event

This post was written for CILIP in the Thames Valley and was first published on their blog

A ‘Mentor Exchange’ event took place at RISC in Reading on Wednesday 31 May, led by local mentor support officer (MSO) Linda Jones, whose professional home is the University of Portsmouth. Very much a ‘round table’ event (though without a table), attendees gathered to hear from Linda about CILIP’s move towards training and supporting mentors via the VLE (Virtual Learning Environment) as opposed to the classroom-based training of the past, and to guide prospective mentors in what to expect from the experience of training and of mentorship in general.

The event started with each attendee describing who they were and what their situation was. The group included everyone from recently Chartered librarians interested in becoming mentors, to a retired colleague who sits on the CILIP Board and has experience evaluating professional registration portfolios. A new mentor shared her first impressions of mentorship, including to “remember that the work comes from them.” She added that there is no need to talk a great deal, but when doing so, to ask questions. This nicely set the scene for an exercise from Linda for all attendees, more of which later. The attendee concluded that being a mentor was “not as scary” as she thought it would be.

The perspective of a person sitting on the CILIP Board – and effectively being a mentor of mentors – was provided next. The main lessons here were of the benefits of mentoring someone who works in a different branch of information management to the one in which the mentor gained most of his/her experience. Not only does this make the experience more interesting for the mentor, it also encourages the mentee to consider their work objectively, which helps them to place their activities in a wider context – in the words of the attendee, “to look outside themselves.” But the evidence they provide in their portfolios must be appropriate to their fields.

The next question from Linda to go round was what are, or what would attendees imagine to be, the most positive and negative sides of mentoring? Speaking for herself, Linda admitted that she thrives on working with people from different sectors, particularly because she enjoys seeing as many libraries as she can get into, believing as she does that they are vital to civilisation. On the negative side, she regrets seeing mentees “driving down the wrong road for far too long,” developing material that doesn’t fit in their portfolios. Overall, she’s had fantastic experiences as a mentor, whether her mentees have stayed in touch or have happily gone on their way after the experience of professional registration.

Two attendees who attained Chartered status relatively recently and are interested in becoming mentors were concerned about whether, in doing so, they would be able to provide the same quality of mentorship that they had enjoyed as mentees, whether this is guidance in general about professional practices or specifically following CILIP’s procedures. The more experienced attendees reassured them that all mentors have access to an MSO (in our case, Linda), and that curiosity is enough when working with someone from a different sector. The question of the risk of mentoring someone who goes on to fail to achieve professional registration was also aired. The reply here was to remember that it is always the mentee’s portfolio and that failure is their failure, rather than that of the mentor. In these cases mentors can work with the feedback provided by the assessment board.

Another attendee was concerned about being able to convince her line manager to let her be a mentor, to which the group agreed that the best approach was to find the right words to demonstrate the benefits to her employer of doing so.

Move towards online training
Next Linda explained recent changes in CILIP’s mentor training. MSOs’ recent experiences are that it is increasingly difficult to get would-be mentors and trainers into the same room for a full day. CILIP’s answer is to virtualise the process via the VLE, which now has a separate mentoring section. What would have been communicated in one day’s worth of intensive classroom training is now to be imparted over four weeks online. To be clear, the window available is four weeks but the amount of work is the same as can be achieved in a day, i.e. seven hours. The four weeks are structured, however. There are four weekly units that must be done in four consecutive weeks. The second week consists of a practical exercise conducted with another candidate. Two MSOs also follow the course (Linda was keen to point out that this new process is also new to them).

Those worried about using the VLE should be reassured that it is much easier to use than it was a year ago and that the course is well-structured. Discussion boards replace face-to-face contact and Linda encourages everyone to chat, get to know people, and explore. She commented that she has also asked that current mentors be allowed to do the training, to bring them up-to-speed with the new interface. Sections of the course are made available as one progresses through it. Much of the information is provided in the form of video. It is possible to download course frameworks for those wanting to study them offline.

This part of the meeting ended with a discussion about Certification and how it relates to the NVQ in library and information studies, and how both might relate to the development of an apprenticeship scheme for our sector.

A fun exercise to end
To round off the meeting Linda handed out plain envelopes to everyone, who split into pairs for an exercise on closed and open questions. Each envelope contained a picture of an animal. The object was for each attendee to find out what animal their counterpart had by asking as many closed questions as they liked, followed by a single open question. Closed questions were defined as ones in which there is a binary answer, meaning one of two options – typically yes or no. An open question, by contrast, is one that might have any answer.

In practice, the exercise was good fun but it also clearly illustrated the power of open questions and the limits of closed ones. For a mentor/mentee conversation, open questions are always to be preferred because they encourage the mentee to provide information, which requires them to think about their situation objectively, in order to explain it to their mentor, and to examine their own feelings.

In conclusion
Linda brought the meeting to a close with some useful advice. Mentoring doesn’t have to be face-to-face, but it’s worth remembering that what works for oral communication isn’t always appropriate when written, so make sure your tone is right in any written communication. Mentoring can give you skills to deal with difficult people at work. Mentors can say no to any request from mentees.

The final piece of advice came from the CILIP Board member attending, with which Linda wholeheartedly agreed: mentees must express their own professional voices in their portfolios; the test, particularly for Chartership, is that they demonstrate their own initiative.

Working on a website

It’s been a while since I’ve posted but I’ve been busy with that full-time job I kept banging on about. Still doing the reclassification (2700 items and counting); still not started Chartership.

Since submitting my dissertation over a year ago I’ve thought a few times about making it available online (OK, “publishing” it), so I hereby announce my intention to develop an HTML version of it.

Readers of this blog will be the first to know when there’s something to see…


Re-classification

A major ongoing background project at St Hugh’s College library is that of reclassification or, more specifically, changing the shelfmarks on older stock according to the latest edition of the Dewey Decimal Classification (DDC) in line with the bespoke requirements of the library. These requirements are based on extensive consultation with the College’s Fellows in each subject area, and on a more general assessment of the needs of members as a whole.

Published by OCLC, but edited by a team at the Library of Congress, DDC is currently in its 23rd edition. New editions are published every seven years or so, when trends in the books being published at the time press the editors to revise the classifications in order to better match the needs of DDC’s users.

Reclassifying large collections is an onerous task and no library can afford to implement a new DDC edition simply because it is available. However, because a new edition changes both numbers and the precise subject definitions they relate to, one practical consequence is that, over time, new acquisitions are classified and shelved somewhere other than the books a library already holds on those subjects.

So, reviewing shelfmarks on older stock gives an opportunity to re-sort the collection so that older and newer material that would otherwise be shelved apart can be collocated. It also allows for old labels or handwritten codes to be replaced with standardised, easy-to-read labels, making finding books or determining where they should go back on to the shelves that much easier for readers and staff members alike.

One of the customisations adopted at St Hugh’s is that numbers will be no longer than three digits after the point, that the final digit must not be zero, and that the number will be followed by a three-letter code. The code ensures that items with the same number are shelved alphabetically. It is the first three letters of the lead author’s surname or, in the case of edited works, the first three letters of the first keyword in the title, which ensures that if future editions are edited by different people at least they will be shelved together. For example:

  • 111 GOS (subject: Ontology; author: Gosden; title: Social being and time)
  • 409 KAR (subject: Incidence of and public measures to prevent disease > History, geographic treatment, biography; author: Karlen; title: Man and microbes)
  • 51 BEH (subject: General topics in behaviour; editor: Krebs; title: Behavioural ecology)
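For anyone who likes to see a rule written down precisely, here is a minimal Python sketch of the shelfmark convention described above. It is only an illustration, not how the library actually produces its labels: the function names, the tiny stopword list used to find the "first keyword", and the decision to truncate rather than round long numbers are all my own assumptions, and the numbers in the usage note are made up apart from the 111 GOS example.

```python
# Illustrative sketch only. Names, the stopword list, and truncation (rather
# than rounding) are assumptions, not the library's documented procedure.
STOPWORDS = {"a", "an", "the"}  # hypothetical list of words that are not keywords


def trim_number(number: str, max_decimals: int = 3) -> str:
    """Keep at most `max_decimals` digits after the point, then drop
    trailing zeros so that the final digit is never zero."""
    whole, _, fraction = number.strip().partition(".")
    fraction = fraction[:max_decimals].rstrip("0")
    return f"{whole}.{fraction}" if fraction else whole


def three_letter_code(text: str) -> str:
    """First three letters of `text`, upper-cased, ignoring punctuation."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        raise ValueError(f"no letters found in {text!r}")
    return "".join(letters[:3]).upper()


def shelfmark(number: str, surname: str = "", title: str = "") -> str:
    """Build a shelfmark from a Dewey number plus either the lead author's
    surname or, for edited works, the first keyword of the title."""
    if surname:
        code = three_letter_code(surname)
    elif title:
        keywords = [w for w in title.split() if w.lower() not in STOPWORDS]
        if not keywords:
            raise ValueError("title contains no keyword")
        code = three_letter_code(keywords[0])
    else:
        raise ValueError("need a surname or a title")
    return f"{trim_number(number)} {code}"


def shelf_order(mark: str):
    """Sort key: by number first, then alphabetically by the letter code."""
    number, code = mark.rsplit(" ", 1)
    whole, _, fraction = number.partition(".")
    return (int(whole), fraction, code)
```

With these assumptions, `shelfmark("111", surname="Gosden")` gives `111 GOS`, a made-up number such as `123.4506` with the title "The behavioural ecology" gives `123.45 BEH`, and `sorted(marks, key=shelf_order)` puts items with the same number in alphabetical order by code.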

My first contribution to this project has been to reclassify our anthropology/archaeology books that had the shelfmark 572, which is now used for biochemistry. Before the books can be relabelled and redistributed to their new locations each must go through a review process to determine the most appropriate shelfmark for it according to the criteria now applied. Sources for this include what other Oxford libraries that use Dewey (not many) use if they have the same items, what the colophon says (Library of Congress cataloguing-in-publication data included behind the title page of many US books states the Dewey number and the edition current at the time of cataloguing), and looking up via their respective web catalogues what the Library of Congress and the British Library have on their records. Likely candidate numbers are then checked in the online version of DDC 23 (“WebDewey”) and a table constructed of titles, with their proposed numbers and the sources for these recommendations.

Only once a whole number section (such as 572, excluding newer items already catalogued to the current criteria) has been examined and the proposed numbers agreed, does the fun of relabelling and moving each book begin. The items under 572 were dispersed far and wide, mostly to 300-309 but also to the 110s, 150s, 170s, 200s, 210s, 390s, 550s, 570s, 590s, 610s, 700s, 720s and 990s.

On reflection, it is a very satisfying procedure to make an intellectual judgement about how a book should be classified, enact it, and then give that book a new home, amongst other titles that were not previously its neighbours. This illustrates very clearly how much we, as readers and librarians, assume about a book from where it is shelved, and the other books that surround it.

Librarian and author Outi Pickering talks about her debut novel ‘Two Point Five Cheers for the Library’

Here’s something I wrote about a recent CILIP in the Thames Valley meeting…


On Wednesday 2nd March we enjoyed a presentation from the local librarian-cum-author, at The King’s Arms in Oxford. Thanks to Matthew Henry for writing this blog post.

Outi introduced her presentation by explaining her background. She is Finnish and moved to England just before the turn of the millennium after completing an MLIS at Åbo Akademi. She had previously worked for a library supplier and in adult education, and had studied English and Italian for her first degree, specialising in Scottish literature. She also admitted to having been writing from childhood.

Since 2000 Outi has been assistant librarian at Oxford Health NHS Foundation Trust’s Warneford Hospital in Headington and it is, perhaps, this environment that has provided a rich source of inspiration for her book, which describes a service that is forever under pressure from “the powers that be” yet is always being asked to do more with less while accommodating constant user-survey feedback.

To set the scene, the fictional library has recently – and accidentally, of course – been relocated into a broom cupboard (literally) and is staffed by characters such as Simon Pendrive, Mimosa Macaroon, Vladimir Logoff and Claire Twinsett, suffering from syndromes such as COCRAN (coffee and chocolate-related aggressive nervousness). My personal favourite is Fiona Fatica (Outi’s Italian coming in handy there).

The NHS Trust in question is, naturally, in Cardigan Bay. Without giving too much away, the book is chock-full of clever coinages (‘health-scare library’, ‘nil by mouth’, ‘only buy a book you can classify easily’) and scenarios that would have anyone who’s ever worked in an office environment chuckling with recognition. For librarians, it’s almost too close to the bone.

To finish, Outi read us a preview of the next instalment. It wouldn’t be surprising if some enterprising TV producer decided to make Two Point Five Cheers for the Library into the next The Office, so get on board before it takes off.

Two Point Five Cheers for the Library is published by Olympia Publishers, ISBN 9781848976146. £8.99 in paperback.

Matthew Henry, library assistant at St Hugh’s College, Oxford. Twitter: @matthew1001001

Improving discovery tools, archiving data sets, and collecting non-print Legal Deposit

Seven weeks into being library assistant at St Hugh’s and, if you’ve been following any of my social media posts [see the previous blog post for links] then you’ll know I spend a lot of time with books. One of the great things about the way libraries at Oxford University work is that the Bodleian Libraries are the superstructure that holds everything together.

In the case of the Colleges, they provide the union catalogue SOLO and online database and journal subscriptions. Library staff at the colleges have access to the wide range of training courses that the Bod runs. So far I’ve done sessions on bibliographic records (a prelude to basic cataloguing), social media for libraries, and reference management software.

We also get invited to “All-Staff” meetings, to get the inside story from Bodley’s Librarian Richard Ovenden about what’s going on behind the scenes. In this post I wanted to mention three examples of digital library projects that might be of interest to Citylis students currently on the Digital Libraries module. They are:


1) Resource Discovery

This is a major investigative project by the University, sponsored and managed by the Bodleian. The intention is to build something “which will make the task of searching through the riches of the University’s intellectual assets easier and bring them to greater prominence online.” And we all know that includes bibliographic records.

An extensive 88-page report has been compiled to “scope the work needed to develop an intelligent search and retrieval tool or tools.”

The next stage is using collection-level metadata to visualise the extent of Oxford collections in an “interactive diagram”. We often hear that libraries have excellent metadata but that these data are difficult to obtain, and certainly link to, in an increasingly networked environment. The Resource Discovery project is one institution’s attempt to remedy that and integrate it with indexes of metadata from other sources, with the aim of achieving a step-change in accessibility and, thereby, usability.


2) ORA-Data

The Oxford Research Archive (ORA) is the University’s established repository of theses and publications but its capability has recently been extended to manage datasets. ORA-Data was launched in early 2015 and is one of a suite of support services designed to help researchers access, create, archive, share, and cite research data. It is also built to hold catalogue records of data archived at subject specialist repositories. More information at these links:

Data Archiving (ORA-Data)



3) Electronic Legal Deposit (eLD)

Finally, the end of 2015 saw over one million electronic articles and more than 41,000 e-books deposited at UK Legal Deposit libraries and made available through SOLO. The focus now moves towards digital maps, then official papers, digital sheet music, grey literature (behind paywalls) and other emerging formats.

More information here:

Electronic/Non-print Legal Deposit [scroll down to this heading]


And that really is a fraction of the digital landscape at the Bodleian. Go and explore!

And so it begins…

Last week I started my new job at St Hugh’s – a full-time post, which makes it my first ‘proper’ job since being made redundant in 2012. I am sorry to leave Reading Libraries, especially as they undergo a massive funding cut this year, but I owe it to myself and my family to continue to work towards my goal of Chartered Librarian status, with a job to match. The St Hugh’s position will go a long way towards helping me to achieve that. St Hugh’s library – the Howard Piper – is one of the larger College libraries with some 80,000 volumes – all run by the librarian, and me.

I have only been there for three days, and things will likely change as students return next week, but it’s already clear that it’s a very different institution from Reading Central, with less need to directly help readers, meaning more opportunity to run and develop the library. For example, a lot of time at Reading was spent scanning books in and out and joining members. St Hugh’s ‘pre-joined’ readers and the stock’s RFID tags mean that these activities are far less time-consuming.

One of the things I’ll be handling is social media, so keep an eye on St Hugh’s Library on Twitter, Facebook and Instagram. I’ll also be prepping new books, attending to the reading rooms, shelving, and developing new branded promotional materials.

I really cannot express how delighted I am to be given the opportunity to work in an academic library at such a prestigious institution. I intend to blog about what I’m doing and, when I’m settled in, to start Chartership. I also hope to write about other libraries in Oxford that I’m entitled to visit – there are 97 in all across the University.

What an amazing start to 2016 and a vindication of my decision to retrain for information work by going to City University and volunteering at Reading. Thanks to everyone who’s helped me, including my parents, but particularly my wife Enza. She has kept everything going while I’ve been finding my feet and I will be forever grateful.