Upscaling digitisation at the Wellcome Library

Wellcome Library – part of the Wellcome Trust, a charitable foundation which funds research; its remit includes research into, and contextualisation of, medical history

Wellcome Library has a lot of unique content – which is the focus of its digitisation efforts. Story so far:

Image library created from transparencies/prints and on-demand photography – 300,000 images
Journal backfile digitisation
Wellcome Film – 500+ titles
AIDS poster projects
Arabic manuscripts – 500 manuscripts (probably biggest single project)
17th Century recipe books

Contribute to Europeana

Digitisation is part of the long-term strategy for the library – but while the aim is to eventually digitise everything, need to target content.

Digitising archival material and around 2000 books from 1850-1990 (pilot project – and of course will test the waters in copyright areas). Also contributing to the Early European Books project – a commercial partnership with ProQuest.

Approach to digitisation projects has changed. Previously did smaller (<10,000 pages) projects: relatively ad hoc, entirely open access, library-centric, no major IT investment. Now doing large projects (>100,000 pages) with involvement from a wider range of stakeholders, within and outside the organisation, which needs major IT development. Increasing commercial partnerships also mean not all outputs will be ‘open access’ – although the feeling is that this covers additional material that would not have been digitised otherwise…

Need to move

  • Manual processes -> Automated processes (where possible)
  • Centralised conservation -> distributed conservation
  • Low QA -> increased QA, error minimisation
  • TIFF -> JPEG 2000 (now 100% JPEG 2000 once the digital copy has been created)
  • Detailed and painstaking -> streamlined and pragmatic

Streamlining:

  • Staff dedicated to specific projects or streams of work
  • Carry out sample workflow tests for new types of material
  • Right equipment for the right job – eliminate the ‘fiddly bits’ – led to:
      • Live-view monitors
      • Easy-clean surfaces
      • Foot-pedals
  • Photographers do the photography
  • Prepare materials separately
  • Leave loose pages and bindings as they are – easier to digitise that way
  • Use existing staff as support
  • Minimise movement
  • Plenty of shelving and working space
  • Find preferred supplier for ad hoc support

Upscaling and streamlining digitisation requires a higher level of project management

Goobi http://www.goobi.org/:
Web-based workflow system
Open source (core system)
Used by many libraries in Germany
Wellcome use the Intranda version (Intranda is a company that develops Goobi)

Goobi is task-focused, with customisable workflows – developed specifically by Intranda
User-specific dashboard
Import/export and store metadata
Encode data as METS
Display progress of tasks, stats on activities
Tracks projects, batches and units
Can call other systems – e.g. ingest or OCR
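
As an illustration of what "encode data as METS" can mean in practice, here is a minimal sketch that builds a bare-bones METS file for a digitised volume using only the Python standard library. It is not Goobi's or Intranda's code: the function name `build_mets`, the `MASTER` file group, the `FILE_0001` style IDs and the sample identifier are all assumptions made for the example, and a real record would also carry descriptive and administrative metadata.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_mets(object_id, image_files):
    """Return a minimal METS tree listing page images in reading order.

    Hypothetical helper for illustration; not part of any Goobi API.
    """
    mets = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": object_id})

    # fileSec: one <file> entry per page image.
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": "MASTER"})

    # structMap: the physical page sequence, pointing back at the files.
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap", {"TYPE": "PHYSICAL"})
    volume = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": "volume"})

    for order, href in enumerate(image_files, start=1):
        file_id = f"FILE_{order:04d}"  # assumed ID scheme
        file_el = ET.SubElement(
            file_grp,
            f"{{{METS_NS}}}file",
            {"ID": file_id, "MIMETYPE": "image/jp2"},  # JPEG 2000 images
        )
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href},
        )
        page = ET.SubElement(
            volume, f"{{{METS_NS}}}div", {"TYPE": "page", "ORDER": str(order)}
        )
        ET.SubElement(page, f"{{{METS_NS}}}fptr", {"FILEID": file_id})

    return mets
```

Calling `ET.tostring(build_mets("b1234567", ["0001.jp2", "0002.jp2"]), encoding="unicode")` gives the XML as a string; the identifier and file names here are made up.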

Q: Is Goobi scalable? Can it be used for very big projects?
A: Goobi works well for small institutions – no need for programmers to implement it, and relatively cheap. Scalability is probably going to be limited by hardware rather than anything else.

Q: How does the Intranda version differ from other versions of Goobi?
A: At least at Wellcome … e.g. Goobi doesn’t handle ‘batches’ of material – Intranda added this. Goobi uses Z39.50 to get metadata; Wellcome wanted to get metadata from elsewhere, so Intranda adjusted it to do that.
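
To make the idea of tracking units in ‘batches’ through a task-based workflow a little more concrete, here is a rough sketch of the data model in Python. None of this is taken from Goobi or from Intranda's version: the task names, the `Unit` and `Batch` classes and their methods are invented for the example, and a real system would configure tasks per project rather than hard-code them.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical task sequence; a real workflow would be configured per project.
TASKS = ["prepare", "photograph", "qa", "convert_to_jp2", "ingest"]


@dataclass
class Unit:
    """One item moving through the workflow, e.g. a single book."""
    identifier: str
    completed: int = 0  # number of tasks finished so far

    @property
    def current_task(self):
        """Name of the task this unit is waiting at, or "done"."""
        return TASKS[self.completed] if self.completed < len(TASKS) else "done"

    def complete_task(self):
        """Mark the current task as finished and move to the next one."""
        if self.completed < len(TASKS):
            self.completed += 1


@dataclass
class Batch:
    """A group of units handled together."""
    name: str
    units: list = field(default_factory=list)

    def progress(self):
        """Count how many units are waiting at each task."""
        return Counter(unit.current_task for unit in self.units)

    def percent_done(self):
        """Share of all tasks across the batch that are finished."""
        total = len(self.units) * len(TASKS)
        finished = sum(unit.completed for unit in self.units)
        return 100 * finished / total if total else 0.0
```

A dashboard of the kind described in the talk could then be driven by `batch.progress()` and `batch.percent_done()` for each batch in a project.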
