Journals and Repositories: an evolving relationship

This first session is the keynote, by Stephen Pinfield.

Stephen is going to give some background and definitions, then look at three different models of interaction between journals and repositories, and finally look at issues around implementing these models.

Stephen takes us back to the Budapest Open Access Initiative, which identified two routes to ‘Open Access’:

OA Repositories (a.k.a. ‘green’)
OA Journals (a.k.a. ‘gold’)

Stephen is saying that these have sometimes been seen as ‘competitive’ – perhaps mutually exclusive? – but we are now seeing these as complementary, or even overlapping, approaches.

Stephen breaking down some terminology:

Repository: a set of systems and services which facilitates the ingest, storage, management, retrieval, display and reuse of digital objects
Journal: ‘a collection of quality-assured articles normally within a defined subject area, made available at regular intervals under a single ongoing title’ (n.b. quality assurance in an academic context usually means peer review) – I’m surprised there is no mention of ‘edited’ here – Stephen saying brand is key, journal titles are a ‘brand’
Open Access: where the full content is freely, immediately and permanently available and can be accessed and reused in an unrestricted way. Stephen stresses the ‘timeliness’ of OA as a key point

Stephen now moving onto 3 possible models for repositories and journals interacting:

Repository -> Journal
Journal -> Repository
Repository -> Overlay Journal

1. Repository -> Journal

Stephen describing possible workflow associated with this:

Author writes paper, and submits to journal for publication and puts pre-print in repository (which is immediately available)
Paper is peer-reviewed by journal, revised, and the author submits final version to the journal, and to the repository (post-print)
Journal publisher edits and formats paper, and publishes

In general the ‘repository copy’ is made available before the journal copy. Stephen notes that the pre-print doesn’t have to be deposited, but in the typical scenario for this model (arXiv) this is what happens.
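A rough Python sketch of the workflow above, purely as an illustration and not something shown in the talk. The `Paper` class, the `run_model_one` function and the version labels are names made up here for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Paper:
    """Hypothetical record of where copies of one paper live."""
    title: str
    repository_copies: List[str] = field(default_factory=list)
    journal_copy: Optional[str] = None


def run_model_one(title: str) -> List[str]:
    """Walk one paper through Repository -> Journal and log each step."""
    paper = Paper(title)
    log = []

    # Step 1: author submits to the journal and deposits the pre-print,
    # which the repository makes available immediately.
    paper.repository_copies.append("pre-print")
    log.append("repository: pre-print available")

    # Step 2: after peer review and revision, the final author version
    # goes to the journal and to the repository (post-print).
    paper.repository_copies.append("post-print")
    log.append("repository: post-print available")

    # Step 3: the publisher edits, formats and publishes.
    paper.journal_copy = "published version"
    log.append("journal: published version available")

    return log


for step in run_model_one("Example paper"):
    print(step)
```

The order of the log entries is the point: both repository copies appear before the journal copy does.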

There is an assumption that once the paper has been formally published in a journal, usage switches from the post-print (that is, post peer-review) copy in the repository to the journal copy. A study by Henneken et al. (http://arxiv.org/abs/cs/0609126) shows that this is what happens (in astronomy at any rate).

All this suggests that in this model repositories and journals can happily coexist.

2. Journal -> Repository

Author goes through publication process with OA/hybrid journal – with peer-review, revisions, editing and formatting etc. The article is published formally in the journal
After formal publication, the author/publisher deposits paper in repository, the repository processes the paper (e.g. restructuring, re-formatting), manages preservation, and makes the paper available

Two key points about this model: the copy in the repository is the published version (unlike above), and the model includes management of preservation (which isn’t handled within model 1 at all – although it may be handled outside the model).

This is the model taken by the Wellcome Trust/UKPMC in particular, and described by R Terry (2005). In this model the repository sets up the article for re-use and analysis, including ‘mining’ (i.e. machine parsing of the text to extract meaning or data).

UKPMC currently gets 600,000 hits a month, with 60,000 article downloads a month. Current content stands at 1.4 million full text articles, increasing at about 40,000 articles a month. Specifically this model is being driven by funder mandates.

David Prosser (SPARC) has suggested a development on this model, which is very similar, but rather than there being a ‘published’ copy and a ‘repository’ copy, the publisher publishes to the repository – so one copy, held in the repository, with the publisher linking to the paper in the repository. In this situation (as opposed to the one Stephen is about to come to), the process is still driven through a traditional ‘publisher’ workflow – it is just the final location of the article is different.

Before coming onto the final model Stephen is noting the functions of Scholarly Communication:

Registration (e.g. register first discovery against specific researchers)
Certification (quality)
Dissemination
Archiving

David Prosser notes a 5th function:

Reward

This is the idea that by publishing in a known journal, this is recognised in ways that reward the author (promotion, reputation etc.)

3. Repository -> Overlay Journal

In the final model that Stephen is going to describe, the author interacts with both the publisher and the repository – unlike above, where the author generally works with the publisher only.

Author writes paper, and submits as pre-print to the repository, which makes it available
An ‘overlay journal’ selects the paper (made available by the repository), and subjects it to peer-review
The author revises the paper on the basis of peer-review
Publisher edits and formats the paper (possibly?), and publisher/author deposits paper (post peer-review and post editing) in the repository which deals with management, preservation, access etc. as in model 2 described above

This model has been described by both JWT Smith (1997) and AP Smith (2000). The model involves the repository as the primary means of management and dissemination of content, where the publisher provides quality assurance etc.
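To compare the three models side by side, here is a small Python summary built only from the notes above. It is my own illustrative sketch rather than anything from the talk; the `MODELS` dictionary, its field names and the `models_with_preservation` helper are all invented for this example.

```python
# Hypothetical summary of the three models as described in these notes.
MODELS = {
    "Repository -> Journal": {
        "repository_holds": ["pre-print", "post-print"],
        "quality_assurance": "journal",
        "preservation_in_model": False,  # may be handled outside the model
    },
    "Journal -> Repository": {
        "repository_holds": ["published version"],
        "quality_assurance": "journal",
        "preservation_in_model": True,
    },
    "Repository -> Overlay Journal": {
        "repository_holds": ["pre-print", "peer-reviewed version"],
        "quality_assurance": "overlay journal",
        "preservation_in_model": True,
    },
}


def models_with_preservation(models):
    """Return the names of models where the repository manages preservation."""
    return [name for name, info in models.items() if info["preservation_in_model"]]


print(models_with_preservation(MODELS))
```

Laid out like this, the main differences are which version the repository holds and whether preservation sits inside the model.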

Stephen now covering some of the ‘issues’ around these models:

Changing shape of the ‘journal’
- ‘Deconstructed journal’
- Journal as quality stamp
- Journal as brand
Changing shape of the ‘article’
- Single article in multiple journals (I can see this, but wonder how interested academics are in this – possibly generating multiple different versions as each journal applies different quality measures, peer-review for each journal could end up with contradictory revisions?)
- Version identification and management becomes key in this type of scenario – integrity assurances; standards; custom and practice for citations; version of record
Changing shape of ‘publication’
- Formal publication and dissemination
- Publication process

Overlay journal very ‘new’ – not very many examples of it yet.

Stephen says that in all of these models the ‘repository’ is key. This seems a bit self-fulfilling based on Stephen’s approach – he hasn’t considered any other model here – or possibly his definition of a repository is so encompassing that any system that makes the article available becomes a repository. I’d argue that although each model relies on a ‘repository’, it wouldn’t have to bear much resemblance to what we have at the moment (especially if you accept, as model 1 does, that preservation may take place outside the model).

Stephen noting that all the models (to some extent) separate dissemination from quality assurance.

Stephen throwing out some questions:

What are the business and funding models – for repositories as well as for publishers, and for research funders and institutions?

We need to develop models that allow/enable/encourage(?) institutions to provide funds to authors for OA publication in an author-pays model.

There are still many issues relating to ‘content management’ – technical etc.

There are policy issues – funder requirements, institutional practices, and the REF (Research Excellence Framework) which is coming, and will contain a citation analysis component using figures from the ‘traditional publishing process’.

Stephen strongly believes the REF risks stifling innovation by measuring in a specific way, and pushing us back to reliance on traditional models (e.g. academics will want to publish in ISI indexed journals to get figures into ISI citation measures) – I suspect he is right…

Finally Stephen is stressing the cultural issues around the way scholarly communication works. However, the challenges that exist confront the traditional publishing model just as much as they complicate the development of new models.

We need to move away from ‘paper-based’ models to harnessing the power the internet offers.

Question at the end from someone from ALPSP asking how it can be efficient/economic for each institution to run a repository. Stephen says that repositories deliver more functions to institutions than just dissemination etc. (I think this is a bit of a weak answer – the point made is a good one, and it is far from clear that we need each institution to run a repository to enable any of the models Stephen has described)
