LIFE² – Research Data Costs

This session is not quite a case study, but rather a description of the application of the LIFE model to research data preservation by Neil Beagrie, which was used to produce the recently published “Keeping Research Data Safe” report.

They found that a number of factors had an impact on the costings from the model:

  • Costs of ‘published research’ repositories vs ‘research data’ repositories
  • Timing – it costs c.333 euros to create a batch of 1,000 records, but 10 years after creation it may cost 10,000 euros to ‘repair’ a batch of 1,000 records with badly created metadata
  • Efficiency curve effects – we should see a drop in costs as we move from start-up to operational activity
  • Economy of scale effects – a 600% increase in acquisitions gives only a 325% increase in costs (the timing and scale figures are illustrated in the rough sketch after this list)
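To make those two numbers a little more tangible, here is a back-of-the-envelope sketch in Python. The 333 / 10,000 euro and 600% / 325% figures are the ones quoted in the session; the script and its arithmetic are purely my own illustration, not part of the report:

```python
# Back-of-the-envelope illustration of the two numeric effects quoted above.
# The 333 / 10,000 euro and 600% / 325% figures come from the session notes;
# everything else here is just illustrative arithmetic.

batch_size = 1000

create_now = 333        # euros: creating metadata for a batch of 1,000 records at ingest
repair_later = 10_000   # euros: repairing badly created metadata 10 years on

print(f"Per record at creation: {create_now / batch_size:.2f} euros")
print(f"Per record to repair:   {repair_later / batch_size:.2f} euros")
print(f"Repair costs roughly {repair_later / create_now:.0f}x as much as getting it right up front")

# Economy of scale: a 600% increase in acquisitions (7x volume) gives only a
# 325% increase in costs (4.25x spend), so the cost per item falls sharply.
volume_multiplier = 1 + 6.00   # 600% increase
cost_multiplier = 1 + 3.25     # 325% increase
print(f"Unit cost at the larger scale: ~{cost_multiplier / volume_multiplier:.0%} of the original")
```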

A key finding is that acquisition and ingest costs are high compared to archival storage and preservation costs. This seems to be because existing data services have decided to ‘take the hit’ upfront, making sure ingest and preservation issues are dealt with at the start of the process. I think this is a key outcome from the report, but based on the discussion today I’m not sure what it tells us. I guess it is a capital vs ongoing cost question. If you’d asked me at the start of the day I’d have said that the model described was a reasonable one. However, after Paul Courant’s talk I wonder if this could result in dangerous inaction: if we can’t afford preservation, we won’t start collecting. The issue is that ongoing costs can be spread over a long period of time, so does dealing with a heavy upfront cost make sense?
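One way to make that capital vs ongoing question concrete is a net-present-value comparison along the following lines. Everything here is hypothetical: the figures, the ten-year horizon and the discount rate are all invented for illustration and none of them come from the report:

```python
# Hypothetical net-present-value comparison of a heavy upfront ingest spend
# versus spreading remediation costs over time. All figures and the discount
# rate are invented for illustration; none come from the report.

def npv(cashflows, rate):
    """Discount a list of annual costs (year 0 first) back to present value."""
    return sum(cost / (1 + rate) ** year for year, cost in enumerate(cashflows))

discount_rate = 0.04   # assumed annual discount rate
years = 10

# Option A: spend heavily at ingest, then cheap storage/preservation each year.
upfront_heavy = [10_000] + [500] * years

# Option B: minimal ingest spend now, larger ongoing repair/preservation costs later.
pay_as_you_go = [2_000] + [1_500] * years

print(f"Heavy upfront NPV:  {npv(upfront_heavy, discount_rate):,.0f}")
print(f"Pay-as-you-go NPV:  {npv(pay_as_you_go, discount_rate):,.0f}")
```

However the invented numbers fall, the point is simply that discounting future spend is the standard way to compare a one-off upfront cost with costs spread over many years, and that is the comparison the report’s finding invites.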

Neil is making a number of observations, but stressing that he does not regard the study as the final word on costs.
