R.I.P.ositories

As Universities are investing more in building/supporting/maintaining Digital Object repositories – specifically for OA purposes – I’m also starting to see quite a bit of ‘anti-repository’ sentiment on various blogs.

Perhaps the earliest post I saw in this area was The importance of being open by Andy Powell, in which he pondered the difference between the JISC IE approach and the Web 2.0 approach. For those not familiar with the JISC IE approach, I’d recommend looking at Andy’s presentation on Item Banks and the JISC IE, specifically slide 8.
I made a comment in response to Andy’s post suggesting that "the IE encourages the idea of a closed community rather than integrating with the wider world" – and looking at the post and IE diagram again, I think this is a definite problem. Note that the ‘presentation’ layer of the IE diagram doesn’t include a straightforward ‘open web’ route or similar – access is all via portals or systems.

Interestingly, as early as 2003, Clifford Lynch said "a university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members" and also noted "An institutional repository is not simply a fixed set of software and hardware" (ARL : A Bimonthly Report, no. 226 February 2003) – but much of the activity around repositories seems to have focused on the software options rather than the services.

In a later post (The R Word) Andy says "The important point, at least as far as open access is concerned, is not that such papers are deposited into a repository but that they are made freely available on the Web." – this starts to get to the heart of the matter I think. In the post The Repository Roadmap – are we headed in the right direction? Andy takes this further, saying "One of the things I want to argue in the presentation (though I know that this is something that Rachel, my roadmap co-author, strongly disagrees with) is that, from the perspective of consumers, repositories are just Web sites.  Somehow, it almost feels like heresy to say so – I don’t know why!?"

At the CRIG Unconference Paul Walk from UKOLN was quoted as saying “Wouldn’t it be great if the outcome of this unconference was that repositories were just wrong?” and although he made it clear that this was "a sarcastic response" (Repositories get my vote), the remark obviously struck a chord with more than one participant.

To bring us up to date Andy blogged about his presentation at the VALA 2008 conference (Repositories thru the looking glass), pushing three key points – which I paraphrase here:

  • "…our current preoccupation with the building and filling of ‘repositories’ (particularly ‘institutional repositories’) rather than the act of surfacing scholarly material on the Web means that we are focusing on the means rather than the end (open access)…"
  • "…our focus on the ‘institution’ as the home of repository services is not aligned with the social networks used by scholars, meaning that we will find it very difficult to build tools that are compelling to those people we want to use them…"
  • "…the ‘service oriented’ approaches that we have tended to adopt in standards like the OAI-PMH, SRW/SRU and OpenURL sit uncomfortably with the ‘resource oriented’ approach of the Web architecture and the Semantic Web…"

Paul Walk responds to this – with much agreement and some slight disagreements – and there have been quite a few tweets supporting Andy’s basic points. I’m not altogether sure I quite get the differentiation between ‘service oriented’ and ‘resource oriented’, but luckily Andy promises to expand on this at some point…

So – what’s my take, as Project Director for the Digital Repository at Imperial College (Spiral) (which we are formally launching next month)?

Well, firstly it’s ironic that the popular repository software was built with OAI in mind, and this was (at least partially) intended to make the ‘invisible web’ available to search engines. The problem with this approach seems to be that it built a secondary web of data sources which talk to each other via protocols not widely adopted outside the immediate community.
I suspect that within the repository community there is some tendency towards a silo mentality – a repository is a container you put stuff in, which leads you to think that you need an ‘interface’ for everyone to access the stuff in the container.

On the other hand, the criticism seems to overlook some of the work done in providing interfaces that are more ‘native web’ than this – for example, the e-prints.org call for plug-ins to do things like expose repository entries as RSS feeds.
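To give a rough sense of how little is involved in exposing repository entries as an RSS feed, here is a minimal Python sketch. Everything in it is an assumption for illustration: the records, the example.org URLs and the build_rss function are made up, and this is not how any particular repository plug-in works.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository records; titles, links and abstracts are placeholders.
records = [
    {"title": "Example paper A",
     "link": "https://repository.example.org/item/1",
     "description": "Abstract of example paper A."},
    {"title": "Example paper B",
     "link": "https://repository.example.org/item/2",
     "description": "Abstract of example paper B."},
]


def build_rss(channel_title, channel_link, items):
    """Return an RSS 2.0 document (as a string) listing the given items."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 expects a title, link and description on the channel itself.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Recent deposits"
    # One <item> per repository record, each pointing at an ordinary web URL.
    for item in items:
        node = ET.SubElement(channel, "item")
        for field in ("title", "link", "description"):
            ET.SubElement(node, field).text = item[field]
    return ET.tostring(rss, encoding="unicode")


feed = build_rss("Example repository", "https://repository.example.org/", records)
print(feed)
```

The point of the sketch is simply that each entry ends up as a plain link that any feed reader or crawler can follow, with no repository-specific protocol needed on the consuming side.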
I believe that one of the problems is that repositories aren’t (usually) just ‘webpages’ – they contain what is essentially ‘print’ material in an electronic format. Flickr has been used as an example of a ‘repository’ which integrates more happily with the web – but I’m not convinced it has been any more successful at exposing its material than, say, arXiv is for scholarly papers. Some very quick Google searches for images (image search or standard Google search) seem to indicate that Flickr doesn’t feature particularly heavily in the results. Amazon is more successful, and Wikipedia is of course particularly successful at getting into Google search results – but certainly the latter is ‘born digital’ and part of the warp and weft of the web, rather than discrete documents being ‘added’ into the web. This needs further thought – is it possible to really ‘integrate’ PDFs into the web? Perhaps we need a shift to true ‘born digital’ publishing for this to be successful.

I believe (and I’m pretty sure that here I’m violently agreeing with at least much of the previous writing) that we need to focus on exposing repository content to the web. It may be that the systems we currently use don’t aid this enough, but it’s as much an attitude as anything else. At Imperial I’m keen that we see the academics’ ‘Professional Web Pages’ as the route to the repository (in the near future there will be links from these pages to the full text articles in the repository where available). We’ve also gone as far as we can to ‘hide’ the repository from the users – ideally the academics don’t need to know we have a repository, they are just ‘uploading’ their full-text papers.
At the same time, what we need is a system that helps us administer the workflow around the delivery of digital objects in a corporate environment, but that is invisible to those not involved in the administration – and that’s what I want out of a ‘repository’ – so, for me, the Repository is dead, long live the repository.

6 thoughts on “R.I.P.ositories”

  1. I’ve just come across this post. I have to comment on the issue of the IE being closed. Your comment: “the IE encourages the idea of a closed community rather than integrating with the wider world”. I have always interpreted the IE diagram as a pretty high level representation of different sets of services within an infrastructure (that should work with the wider web). The boundaries between the different layers have always been blurred (I think they are even more so now we see web service approaches). The presentation layer in my view has always been about all different types of views on to the content and services: subject portal, web search engine, library management system, personal learning environment. The ‘portal’ type services that have been developed through JISC IE funding have always been encouraged to make their services available in flexible ways, so a few years back that meant JSR168 and SRU/W, and then WSRP and REST.
    I guess you do say ‘encourages the idea’; I concede that. We needed [need] to make the more open nature clearer.

  2. Thanks for leaving the comment Rachel. I guess I meant that the IE diagram isn’t explicit enough in how the IE engages with the wider web.
    I agree with much of your comment. However, I think as a community we are often a bit inward looking. To take a couple of examples:
    Many library catalogues have been available on the web for 10+ years – but most output unstructured HTML. How many output RSS as a standard results format? How many provide a publicly accessible RESTful API? While most catalogues support Z39.50, few have adopted SRU/W.
    JSR168 – in theory this is a great idea – but try to integrate any HE based service with Netvibes, Pageflakes, iGoogle etc. We are talking a different language to them.
    It is perhaps too easy to be negative about the situation, and I argue in my post that things are actually more positive than others portray. However, I worry that OAI-PMH is the Z39.50 of the repository world – a great idea, but never going to be adopted by the majority.
