Recently Chris Keene (University of Sussex) sent an email to the LIS-E-RESOURCES email list about the fact that in academic libraries we are now doing a lot more ‘import’ and ‘export’ of records in our library management systems – bringing in bibliographic records from a variety of sources like book vendors/suppliers, e-resource systems, institutional repositories. He was looking for some shared experience and how other sites coped.
One of the responses mentioned the new ‘next generation’ search systems that some libraries have invested in, and Chris said:
“Next gen catalogues are – I think – certainly part of the solution, but only when you just want to make the records available via your local web interface.”
I sympathise with Chris, but I can’t help but think this is the point at which we have to start doing things a bit differently – so I wrote a response to the list, but thought that I’d blog a version of it as well:
I agree that library systems could usefully support much better bulk processing tools (although there are some good external tools like MarcEdit of course, and scripting/programming tools such as the MARC Perl module, if you have people who can program them). However, I'd suggest that we need to change the way we think about recording and distributing information about our resources, especially in the light of investment in separate 'search' products such as Aquabrowser, Primo, Encore, Endeca, etc.
If we consider the whole workflow here, it seems to me that as soon as you have a separate search interface the role of the 'library system' needs to be questioned – what are you using it for, and why? I'm not sure funnelling resources into it so they can then be exported to another system is really very sensible (although I absolutely understand why you end up doing it).
I think that once you are pushing records into Aquabrowser (taking Sussex as an example) there is little point in also pushing them into the catalogue – what extra value does this add? For books (print or electronic) you may continue to order them via the library system – but you only need an order record in there, not anything more substantial – you can put the 'substantial' record into Aquabrowser. The library system web interface will still handle item level information and actions (reservations/holds etc.) – but again, you don't need a substantial bib record for these to work – the user has done the 'searching' in the search system.
For the ejournals you could push directly from SFX into Aquabrowser – why push via the library system? Similarly for repositories – it really is just creating work to convert these into MARC (probably from DC) to get them into your library system, to then export for Aquabrowser (which seems to speak OAI anyway).
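To make the repository case a bit more concrete, here is a minimal sketch of what "speaking OAI" looks like from the consuming side: build an OAI-PMH ListRecords request for Dublin Core, and pull a few fields out of the response without ever going near MARC. The base URL, the sample record and the function names are all made up for illustration, and I'm not claiming this is how Aquabrowser (or any other product) actually does its harvesting.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Standard OAI-PMH and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL (base_url is a hypothetical endpoint)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

def parse_records(xml_text):
    """Extract identifier, titles and creators from an oai_dc response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS
            ),
            "titles": [e.text for e in rec.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in rec.iterfind(".//dc:creator", NS)],
        })
    return records

# Invented sample response, trimmed to the elements used above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:42</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Thesis</dc:title>
          <dc:creator>Author, A. N.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://repo.example.org/oai"))
    print(parse_records(SAMPLE))
```

A real harvester would also need to fetch the URL, follow resumption tokens and cope with deleted records, all of which I've left out here.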
One of your issues is that you still need to put stuff into your library system, as this feeds other places – for example at Imperial we send our records to CURL/COPAC as well as other places – but this is a poor argument going forward – how long before we see COPAC change the way it works to take advantage of different search technology (MIMAS have just licensed the Autonomy search product …). Anyway – we need to work with those consuming our records to work out more sensible solutions in the current environment.
I'd suggest what we really need to think about is a common 'publication' platform – a way of all of our systems outputting records in a way that can then be easily accessed by a variety of search products – whether our own local ones, remote union ones, or even ones run by individual users. I'd go further and argue that platform already exists – it is the web! If each of your systems published each record as a 'web page' (either containing structured data, or even serving an alternative version of the record depending on whether a human or machine is asking for the resource – as described in Cool URIs), then other systems could consume this to build search indexes – and you've always got Google of course… I note that Aquabrowser supports web crawling – could it cope with some extra structured data in the web pages (e.g. RDFa)?
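As a rough sketch of the 'one record, one web page' idea, the following picks a representation of a record from the HTTP Accept header: a machine asking for JSON gets plain data, while everything else gets an HTML page carrying RDFa-style attributes. The record, its URL path, the choice of JSON as the machine format and the function names are all assumptions of mine for the sake of the example, not something any of the systems mentioned here is known to do.

```python
from html import escape
import json

# Hypothetical record; in practice this would come from the source system.
RECORD = {"id": "b1234", "title": "An Example Title", "creator": "A. N. Author"}

def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)

def render_html(record):
    """Human-readable page with Dublin Core terms embedded as RDFa attributes."""
    return (
        '<div prefix="dc: http://purl.org/dc/terms/" about="/record/%s">\n'
        '  <h1 property="dc:title">%s</h1>\n'
        '  <p property="dc:creator">%s</p>\n'
        "</div>"
    ) % (escape(record["id"]), escape(record["title"]), escape(record["creator"]))

def render_record(record, accept_header):
    """Return (content_type, body), falling back to HTML when unsure."""
    for media_type, q in parse_accept(accept_header or "text/html"):
        if q <= 0:
            continue
        if media_type == "application/json":
            return "application/json", json.dumps(record, sort_keys=True)
        if media_type in ("text/html", "text/*", "*/*"):
            break
    return "text/html", render_html(record)

if __name__ == "__main__":
    print(render_record(RECORD, "text/html,*/*;q=0.8")[1])
    print(render_record(RECORD, "application/json")[1])
```

Falling back to HTML means a browser or a general web crawler always gets something it can index, while a search product that knows to ask for the structured version can do so at the same address.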
I have to admit that I may be overestimating how simple this would be – but it definitely seems to me this is the way to go – we need to adapt our systems to work with the web, and we need to start now.