RIOXX v2.0

Original concerns for RIOXX:

  • Primary:
    • How to represent the funder
    • How to represent the project/grant
  • Secondary:
    • How to represent the persistent identifier of the item described
    • Provisions of identifiers pointing to related data sets
    • How to represent the terms of use for an item

Original principles:

  • Purpose driven – Focussed on satisfying RCUK reporting requirements
  • Simple (re-use DC, not CERIF)
  • Generic in scope (don’t tie down to specific types of output)
  • Interoperable – specifically with OpenAIRE
  • Developed openly – public consultation

Has anything changed with RIOXX 2 (mid-2014)?

  • Still purpose driven, but now encompassing HEFCE requirements as well
  • Slightly more sophisticated / complex but still quite simple
  • No longer ‘generic’ – explicit focus on publications
  • No longer seen as a temporary measure – positioned to support REF2020
  • Interoperability still key – and currently working on an OpenAIRE crosswalk

Current status:

  • version 2.0 beta was released for public consultation in June 2014
  • version 2.0 RC 1 has been compiled
  • accompanying guidelines are being written
  • an XSD schema has been developed
  • expect full release in late August/early September

Some specific elements:

  • dc:identifier
    • identifies the open access item being described by the RIOXX metadata record, regardless of where it is
    • recommended to identify the resource itself, not a splash page
    • dc:identifier MUST be an HTTP URI
  • dc:relation and rioxxterms:version_of_record
    • rioxxterms:version_of_record is an HTTP URI which is a persistent identifier for the published version of a resource
    • will often be the HTTP URI form of a DOI
  • dc:relation
    • optional property pointing to other material (e.g. a dataset)
  • dcterms:dateAccepted
    • MUST be provided
    • more precise than other dated events (‘published’ date very grey area)
  • rioxxterms:author & rioxxterms:contributor
    • MUST be HTTP URIs – ORCID strongly recommended
    • one rioxxterms:author per author
    • rioxxterms:contributor is for parties that are not authors but credited with some contribution to publication
  • rioxxterms:project
    • joins funder and project in one, slightly more complex, property
    • The use of funder IDs (DOIs in their HTTP URI form) from FundRef is recommended, but other ID schemes can be used, and a funder name can be given instead
  • license_ref
    • adopted from NISO’s Open Access Metadata and Indicators
    • takes an HTTP URI and a start date
    • URI should identify a license
      • there is work under way to create a ‘white list’ of acceptable licenses
    • embargoes can be expressed by using the ‘start date’ to show the date on which the license takes effect
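
Pulling the elements above together, here is a purely illustrative sketch of what a RIOXX 2.0 record might look like. Element names and attributes follow the notes above; namespace declarations are omitted, all values are made-up placeholders, and the exact serialisation should be checked against the final guidelines and XSD:

<rioxx>
    <dc:identifier>https://repository.example.ac.uk/eprint/1234/paper.pdf</dc:identifier>
    <rioxxterms:version_of_record>http://dx.doi.org/10.1234/example.5678</rioxxterms:version_of_record>
    <dc:relation>http://data.example.org/dataset/42</dc:relation>
    <dcterms:dateAccepted>2014-06-01</dcterms:dateAccepted>
    <rioxxterms:author id="http://orcid.org/0000-0000-0000-0000">Smith, Jane</rioxxterms:author>
    <rioxxterms:project funder_id="http://dx.doi.org/10.13039/000000000000" funder_name="Example Funder">EX/0000001/1</rioxxterms:project>
    <license_ref start_date="2015-06-01">http://creativecommons.org/licenses/by/4.0/</license_ref>
</rioxx>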

Funding for the development of RIOXX as an application profile is now at an end, but there is funding for further developments (e.g. software development for repositories etc.)

RIOXX is endorsed by both RCUK and HEFCE

Q: What about implementing RIOXX in CRIS systems?

A: Some work has been done on mapping between CERIF and RIOXX, although it is not ongoing work. A technical description is available for anyone to implement.

A (Balviar): In terms of developing plugins for commercial products – we have not talked to commercial suppliers yet, but are planning to look at what can be developed, and conversations are now starting.

A (James): Already got RIOXX terms in Pure feed at Edinburgh

Q: Can you clarify ‘first name author’ vs ‘corresponding author’ – do you intend the first named author to be the corresponding author?

A: Understand that the ‘common case’ is that the first named author is the corresponding author [at this point lots of disagreement from the floor on this point].

‘first named author’ seen as a synonym for ‘lead author’

Q: Why is vocabulary for (?) not in line with REF vocabulary

A: HEFCE accepts a wider range of outputs than ‘publications’, but RIOXX specifically focusses on publications – where most OA issues lie

Q: ‘Date accepted’ – what about historic publications? They won’t have a date of acceptance

A: RIOXX not designed for retrospective publication – going forward only. Not a general purpose bibliographic record

Comment from Peter Burnhill: RIOXX is not a cataloguing schema – it is a set of labels

Paul W emphasises – RIOXX is not a ‘record’ format – the systems outputting RIOXX will have much richer metadata already. There is no point in ‘subverting’ RIOXX for historical purposes – this isn’t its intended purpose.

Q: Can an author have multiple IDs in RIOXX

A: Not at the moment. Mapping between IDs for authors is a different problem space, not one that RIOXX tries to address

Comment from RCUK: Biggest problem is monitoring compliance with our policies – which RIOXX will help with a lot

Comments from floor: starting to see institutions issuing ORCIDs for their researchers – could see multiple ORCIDs for a single person. Similarly with DOIs – upload a publisher’s PDF to Figshare and you get a new DOI

Q: If you are producing RIOXX what about OpenAIRE

A: There is nothing in RIOXX that would stop it being OpenAIRE compliant – so RIOXX records can be transformed into OpenAIRE records (but not vice versa)

Automatic love

I think more people in libraries should learn scripting skills – that is, how to write short computer programmes. The reason is simple – because it can help you do things quickly and easily that would otherwise be time consuming and dull. This is probably the main reason I started to use code and scripts in my work, and if you ever find yourself regularly doing a job that is time consuming and/or dull and thinking ‘there must be a better way to do this’, it may well be a good project for learning to code.

To give an example. I work on ‘Knowledgebase+’ (KB+) – a shared service for electronic resource management run by Jisc in the UK. KB+ holds details on a whole range of electronic journals and related information, including details of organisations providing or using the resources.

I’ve just been passed the details of 79 new organisations to be added to the system. To create these normally would require entering a couple of pieces of information (including the name of the organisation) into a web form and clicking ‘submit’.

While not the worst nor the most time consuming job in the world, it seemed like something that could be made quicker and easier through a short piece of code. If I do this in a sensible way, next time there is a list of organisations to add to the system, I can just re-use the same code to do the job again.

Luckily I’d already been experimenting with automating some processes in KB+ so I had a head start, leaving me with just three things to do:

  1. Write code to extract the organisation name from the list I’d been given
  2. Find out how the ‘create organisation’ form in KB+ worked
  3. Write code to replicate this process that could take the organisation name and other data as input, and create an organisation on KB+

I’d been given the original list as a spreadsheet, so I just exported the list of organisation names as a csv to make it easy to read programmatically; after that, writing code that opened the file, read it a line at a time and found the name was trivial:

require 'csv'

# read the exported CSV a row at a time and pull out the organisation name
CSV.foreach(orgfile, :headers => true, :header_converters => :symbol) do |row|
    org_name = row[:name]
end

The code to trigger the creation of the organisation in KB+ was a simple http ‘POST’ command (i.e. it is just a simple web form submission). The code I’d written previously essentially ‘faked’ a browser session and logged into KB+ (I did this using a code library called ‘mechanize’ which is specially designed for this type of thing), so it was simply a matter of finding the relevant URL and parameters for the ‘post’. I used the handy Firefox extension ‘Tamper Data’ which allows you to see (and adjust) ‘POST’ and ‘GET’ requests sent from your browser – which allowed me to see the relevant information.

Screenshot of Tamper Data

The relevant details here are the URL at the top right of the form, and the list of ‘parameters’ on the right. Since I’d already got the code that dealt with authentication, the code to carry out this ‘post’ request looks like this:

# POST the form data to KB+ using the already-authenticated Mechanize agent
page = @magent.post(url, {
  "name" => org_name,
  "sector" => org_sector
  })

So – I’ve written less than 10 new lines of code and I’ve got everything I need to automate the creation of organisations in KB+ given a list in a CSV file.
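
For completeness, here is a minimal sketch of how the two pieces fit together. It is illustrative only – the file name, URL, sector value and the login step are placeholders standing in for the real KB+ details:

require 'csv'
require 'mechanize'

# illustrative placeholders – not the real KB+ URL or parameter values
orgfile    = 'organisations.csv'
url        = 'https://kbplus.example.org/organisation/create'
org_sector = 'Higher Education'

@magent = Mechanize.new
# ... log in to KB+ here using the previously written authentication code ...

# read each organisation name from the CSV and submit the 'create organisation' form
CSV.foreach(orgfile, :headers => true, :header_converters => :symbol) do |row|
  org_name = row[:name]
  @magent.post(url, { "name" => org_name, "sector" => org_sector })
end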

Do you have any jobs that involve dull, repetitive tasks? Ever find yourself re-keying bits of data? Why not learn to code?

P.S. If you work on Windows, try looking at tools like MacroExpress or AutoHotKey, especially if ‘learning to code’ sounds too daunting/time-consuming

P.P.S. Perfection is the enemy of the good – try to avoid getting yourself into an XKCD ‘pass the salt’ situation

See the Connection? Toward a WYSIWYNC Literature

Keynote from Ted Nelson

TN has been talking about electronic literature for over 20 years. Felt alienated from the web because of ‘what it is not’.

Starting with the question – “what is literature”? For TN – a system of interconnected documents. But the web supports only ‘one way links’ – jumps into the unknown. Existing software does nothing for the writer to interact with this concept of ‘literature’.

The constructs of books we have recreate the limitations of print – separate documents. Standard document formats – individual characters, scrambled with markup, encoded into a file. This thinking goes deep in the community – and TN contends this is why other ideas of how literature could exist are seen as impossible.

For the last 8-10 years, TN and colleagues have been working on a system that presents an interconnected literature (Xanadu Space). Two kinds of connection:

  • Links (connects things that are different, and are two way)
  • Transclusion (connects things that are the same)

TN illustrating using example of a programming working environment – where code, comments, bugs are transcluded into a single Integrated Work Environment.

  • We shouldn’t have ‘footnotes’ and ‘endnotes’ – they should be ‘on the side’.
  • Outlines should become tables of contents that go sideways into the document
  • Email quotation should be parallel – not ‘in line’

Vision is a parallel set of documents that can be seen side-by-side.

History is parallel and connected – why do we not represent history as we write it – parallel coupled timelines and documents.

Challenge – how do you create this parallel set of connected documents? Each document needs to be addressable – so you can direct systems to ‘bring in text A from document B’. But there are challenges.

TN as a child was immersed in media. His dad was a director for live TV – so TN got to see the making of television firsthand – his first experience was not just of consuming TV but of creating it. At college he produced a musical, a publication, a film. Started designing interactive software.

How did we get here?

TN describing current realisation of the ‘translit’ approach – Xanadu. Several components:

  • Xanadoc – an ‘edit decision list format’ – a generalisation of every quotation connected to its source
  • Xanalink – type, list of endsets (the things pointed at) – what to connect – exists independently of the doc?

What to do about changing documents? You copy & cache.

TN and colleagues almost ready to publish Xanadu specs for ‘xanadoc’ and ‘xanalink’ at http://xanadu.com/public/. Believes such an approach to literature can be published on the web, even though he dislikes the web for what it isn’t…

WYSIWYG – TN says only really applies to stuff you print out! TN aiming for ‘What you see is what you never could’ (do in print) – we need to throw off the chains of the printed document.

 

Joined Up Early Modern Diplomacy

Another winner of a DM2E Open Humanities award being presented today by Robyn Adams (http://www.livesandletters.ac.uk/people/robynadams) from the Center for Editing Lives and Letters. The project looked at repurposing data from the letters of Thomas Bodley (responsible for the refurbishment of the library at the University of Oxford – creating the Bodleian Library).

Bodley’s letters are held in archives around the world. The letters are full of references to places, people etc. The letters had been digitised and transcribed – using software called ‘Transcribers Workbench’ developed specifically to help with early modern English writing. In order to make the transcribed data more valuable and usable they decided to encode people and places from the letters – unfunded work on limited resources. Complicated by obscure references and also sometimes errors in the letters (e.g. Bodley states ‘the eldest son is to be married’ when it turns out it was the youngest son – this makes researching the person to which Bodley is referring difficult).

This work was done in the absence of any specific use case. Now Robyn is re-approaching the data encoded as a consumer – to see how they can look for connections in the data to gain new insights

The data is available on Github at https://github.com/livesandletters/bodley1

Semantic tagging for old maps… and other things

Dr Bernhard Haslhofer from University of Vienna giving details on their winning entry into the DM2E Open Humanities competition (http://dm2e.eu/open-humanities-award-winners-announced/)

MapHub – a tool that allows you to relate a historic map to the world as it is today. It also supports commenting and semantic tagging.

All user contributed annotation is published via the Open Annotation API – so MapHub both consumes and produces open data.

Focussing today on Semantic Tagging in MapHub. While a user enters a full-text comment, MapHub analyses the text, tries to identify matching concepts in Wikipedia (DBPedia), and suggests them. The user can click on a suggested tag to accept it (or click again to reject it). They carried out some research and found there was no difference in user behaviour, in terms of the number of tags added, between a ‘semantic tagging’ approach (linking each tag to a web resource) and ‘label tagging’ (tags operating as text strings).

Having found this successful, Bernhard would like to see the same concept applied to other resources – so planning to extract semantic tagging part of MapHub and develop a plugin for Annotorious. Also going to extend beyond the use of Geonames and Wikipedia – e.g. vocabularies expressed in SKOS. Aim to do this by September 2013.

Contemporaneous part two

Following on from my previous post about BNB and SPARQL, in this post I’m going to describe briefly building a Chrome browser extension that uses the SPARQL query described in that post – which, given a VIAF URI for an author, tries to find authors with the same birth year (i.e. contemporaries of the given author).

Why this particular query? I like it because it exposes data created and stored by libraries that wouldn’t normally be easy to query – the ‘birth year’ for people is usually treated as a field for display, not for querying. The author dates are also interesting in that they give a range for the date a book was actually written, rather than published – the publication date being what is used in most library catalogue searching.

The other reason for choosing this was that it nicely demonstrates how using ‘authoritative’ URIs for things such as people makes the process of bringing together data across multiple sources much easier. Of course whether a URI is ‘authoritative’ is a pretty subjective judgement – based on things like how much trust you have in the issuing body, how widely it is used across multiple sources, how useful it is. In this case I’m treating VIAF URIs as ‘authoritative’ in that I trust them to be around for a while, and they are already integrated into some external web resources – notably Wikipedia.

The plan was to create something that would work in a browser – from a page with a VIAF URI in it (with the main focus being Wikipedia pages), allow the user to find a list of ‘contemporaries’ for the person based on BNB data. I could have done this with a bookmarklet (similar to other projects I’ve done), but a recent conversation with @orangeaurochs on Twitter had put me in mind of writing a browser extension/plugin instead – and especially in this case where a bookmarklet would require the user to already know there was a VIAF URI in the page – it seemed to make sense.

I decided to write a Chrome extension – on the vague notion that it probably had a larger installed base than any browser except Internet Explorer – but later checking Wikipedia stats on browser use showed that Chrome was the most used on Wikipedia at the moment anyway – which is my main use case.

I started to look at the Chrome extension documentation. The ‘Getting Started’ tutorial got me up and running pretty quickly, and soon I had an extension running that worked pretty much like a bookmarklet and displayed a list of names from BNB based on a hardcoded VIAF URI. The extensions are basically a set of Javascript files (with some html/css for display), so if you are familiar with Javascript then once you’ve understood the specific chrome API you should find building an extension quite straightforward.

I then started to look at how I could grab a VIAF URI from the current page in the browser, and only show the extension action when one was found. The documentation suggested this is best handled using the ‘pageAction’ call. A couple of examples (Mappy (zip file with source code) and Page Action by content (zip file with source code)) and the documentation got me started on this.

Initially I struggled to understand the way different parts of the extension communicate with each other – partly because the code examples above don’t use the simplest (or most up to date) approaches (in general there seems to be an issue with the sample extensions sometimes using deprecated approaches). However the ‘Messaging’ documentation is much clearer and up to date.

The other challenge is parsing the XML returned from the SPARQL query – this would be much easier if I used some additional javascript libraries – but I didn’t really want to add a lot of baggage/dependencies to the extension – although I guess many extensions must include libraries like jQuery to simplify specific tasks. While writing this I’ve realised that the BNB SPARQL endpoint supports content negotiation, so it is possible to specify JSON as a response format (using an Accept header of application/sparql-results+json, as per the SPARQL 1.1 specification) – which would probably be simpler and faster – I suspect I’ll re-write shortly to do exactly this.
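
As a rough sketch of what that might look like when testing outside the extension (here in Ruby rather than the extension’s Javascript, and assuming the BNB SPARQL endpoint lives at bnb.data.bl.uk/sparql and accepts a ‘query’ parameter – check the BL documentation for the exact details):

require 'net/http'
require 'uri'
require 'json'

# endpoint URL and parameter name are assumptions based on the BNB Linked Data platform
endpoint = URI('http://bnb.data.bl.uk/sparql')
sparql = <<-QUERY
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?p ?o WHERE {
  ?person owl:sameAs <http://viaf.org/viaf/71388025> .
  ?person ?p ?o
}
QUERY

endpoint.query = URI.encode_www_form(query: sparql)
request = Net::HTTP::Get.new(endpoint)
request['Accept'] = 'application/sparql-results+json'   # ask for SPARQL JSON results

response = Net::HTTP.start(endpoint.host, endpoint.port) { |http| http.request(request) }
JSON.parse(response.body)['results']['bindings'].each do |binding|
  puts "#{binding['p']['value']} #{binding['o']['value']}"
end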

The result so far is a Chrome extension that displays an icon in the address bar when it detects a VIAF URI in the current page. The extension then tries to retrieve results from the BNB. At the moment failure (which can occur for a variety of reasons) just means a blank display. The speed of the extension leaves something to be desired – which means that sometimes you have to wait quite a while for results to display – which can look like ‘failure’ – I need to add something to show ‘working’ status and a definite message on ‘failure’ for whatever reason.

A working example looks like this:

Demonstration of Contemporaneous browser extension

Each name in the list links to the BNB URI for the person (which results in a readable HTML display in a browser, but often not a huge amount of data). It might be better to link to something else, but I’m not sure what. I could also display more information in the popup – I don’t think the overhead of retrieving additional basic information from the BNB would be that high. I could also do with just generally prettying up the display and putting some information at the top about what is actually being displayed and the common ‘year of birth’ (this latter would be nice as it would allow easy comparison of the BNB data to any date of birth in Wikipedia).

As mentioned, the extension looks for VIAF URIs in the page – so it works with other sources which do this – like WorldCat:

Demonstration of Contemporaneous extension working with WorldCat.org

While not doing anything incredibly complicated, I think that it gives one example which starts to answer the question “What to do with Linked Data?” which I proposed and discussed in a previous post, with particular reference to the inclusion of schema.org markup in WorldCat.

You can download the extension ready for installation, or look at/copy the source code from https://github.com/ostephens/contemporaneous

Contemporaneous part one

I recently did a couple of workshops for the British Library about data on the web. As part of these workshops I did some work with the BNB data using both the API and the SPARQL endpoint. Having a look and play with the data got me thinking about possible uses. One of the interesting things about using the SPARQL endpoint directly in place of the API is that you have a huge amount of flexibility about the data you can extract, and the way SPARQL works lets you do in a single query something that might take repeated calls to an API.

So starting with a query like:

SELECT *
WHERE {
<http://bnb.data.bl.uk/id/person/Bront%C3%ABCharlotte1816-1855> ?p ?o
}

This query finds triples about “Charlotte Brontë”. The next query does the same thing, but uses the fact that the BNB makes (where possible) ‘sameAs’ statements linking BNB URIs to the equivalent VIAF URIs:

PREFIX owl:  <http://www.w3.org/2002/07/owl#>
SELECT ?p ?o
WHERE {
?person owl:sameAs <http://viaf.org/viaf/71388025> .
?person ?p ?o
}

This query first finds the BNB Resource which is ‘sameAs’ the VIAF URI for Charlotte Brontë (which is http://bnb.data.bl.uk/id/person/Bront%C3%ABCharlotte1816-1855)  – this is done by:

?person owl:sameAs <http://viaf.org/viaf/71388025>

The result of this query is one URI (or potentially more than one, although not in this particular case), which is then used in the next part of the query:

?person ?p ?o

In this case, the query is slightly wider in that it is possible that there is more than one BNB resource identified as being ‘sameAs’ the VIAF URI for Charlotte Brontë (although in actual fact there isn’t in this case).

Taking the query a bit further, we can find the date of birth for Charlotte Brontë:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX bio:  <http://purl.org/vocab/bio/0.1/>
SELECT ?dob
WHERE {
?person owl:sameAs <http://viaf.org/viaf/71388025> .
?person bio:event ?event .
?event rdf:type bio:Birth .
?event bio:date ?dob
}

The ‘prefix’ statements are just to set up a shorthand for the query – rather than having to type out the whole URI each time I can use the specified ‘prefix’ as an equivalent to the full URI. That is:

PREFIX bio:  <http://purl.org/vocab/bio/0.1/>
?person bio:event ?event

is equivalent to

?person <http://purl.org/vocab/bio/0.1/event> ?event

Having got to this stage – the year of birth based on a VIAF URI – we can use this to extend the query to find other people in BNB with the same birth year – the eventual query being:

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX bio:  <http://purl.org/vocab/bio/0.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?dob ?name
WHERE {
?persona owl:sameAs <http://viaf.org/viaf/71388025> .
?persona bio:event ?eventa .
?eventa rdf:type bio:Birth .
?eventa bio:date ?dob .
?eventb bio:date ?dob . 
?eventb rdf:type bio:Birth .
?personb bio:event ?eventb .
?personb foaf:name ?name
}

I have to admit I’m not sure if this is the most efficient way of getting the result I want, but it does work – as you can see from the results. What is great about this query is that the only input is the VIAF URI for an author. We can substitute the one used here for any VIAF URI to find people born in the same year as the identified person (as long as they are in the BNB).

Since VIAF URIs are now included in many relevant Wikipedia articles, I thought it might be fun to build a browser extension that would display a list of ‘contemporaries’ for an author using the BNB data – partly to show how the use of common identifiers can make these things just fit together, partly to try building a browser extension, and partly because I think it is a nice demonstration of the potential uses of data which we have in libraries but often don’t exploit (like the years of birth/death for people).

But since this post has gone on long enough, I’ll do a follow-up post on building the extension – if you are interested, the code is available at https://github.com/ostephens/contemporaneous

Triples to Trenches – Linked Data in Archives

Lianne Smith from King’s College London Archives

Archives have records/papers from Senior Military personnel (I think I got that right?) [update 10th March 2013: thanks to David Underdown in the comments below for clarifying that Lianne was referring to the Liddell Hart Centre for Military Archives]

The Archives at KCL have a record of being innovative – and the latest project is in the Linked Data space.

Lianne is an archivist, not a technical specialist. 18 months ago, hadn’t heard of linked data – so a beginners view.

Context: Preparation for centenary of the start of the First World War – other institutions also doing work:

KCL had already contributed to the latter of these.

Report had highlighted the problem of locating resources and searching across information sources – even within institutions. Report particularly noted that if content wasn’t surfaced in Google searches it was a lot less ‘discoverable’. Also lack of clear vocabulary for WWI materials – so wanted to establish a standard approach building on existing thesauri etc.

The project also built on previous work funded by JISC – particularly the “Open Metadata Pathway” and “Step Change” projects – which created easy to use tools to enable archivists to publish linked open data within normal workflows and without specialist knowledge.

Aims of Trenches to Triples included:

  • Creation of an API to share data created using the Alicat tool (an output of the Step Change project)
  • Adaptation of the catalogue front end for visualisation of linked data entities
  • Creation of a Linked Data WWI vocabulary of personal names, corporate names, place and subject terms – available for reuse by the archives sector
  • Also act as a case study of Linked Data in archives

Within Alicat Places are based on Google Maps data – which ‘makes it simpler’ [although the problem with using contemporary maps when the world changes…] [also note that looking at a Place ‘record’ on data.aim25.ac.uk there is a GeoNames link – wonder if this is where the data actually comes from? E.g. http://data.aim25.ac.uk/id/place/scapafloworkneyscotland]

Outcomes of project:

  • Creation of WWI dataset and integration into AIM25-UKAT
    • Lessons learned in the creation of the dataset concerning identification of the level of granularity required and the amount of staff time which needs to be invested in preparation
    • different users have very different requirements in terms of granularity of data
    • Team included WWI specialist academic – identified a good resource for battles – could reuse existing data
    • Different users also use variation of terms in their research
  • More work on the front-end (User facing UI) presentation of additional data
    • Being able to integrate things like maps into UI is great
    • Need to work more on what sort of information you want to communicate about entities – especially things like Names – unlike location where a map is obvious addition
  • Need to increase the availability of resources as linked data
  • need to increase understanding and training in the archives sector
    • This approach is hugely reliant on understanding of the data – need archivists involved
  • need ongoing collaboration from the LODLAM community in agreeing standards for Linked Data adoption

 

Discovery API at The National Archives

Aleks Drozdov – enterprise architect for the Discovery system at The National Archives (TNA). Going to speak about APIs and data and how they are implemented in the Discovery system at TNA.

My Introduction to APIs post is relevant to this talk.

API and Data

An API = Application Programming Interface. Web API – in a web context the API is typically defined as a set of messages over HTTP. Response messages are usually in XML or JSON format.

Data – explosion in amount of data available. Common to ‘mashup’ (combine) data from a number of sources. Also User contributed data.

Discovery Architecture

At the base is an ‘Object Data Store’ – a NoSQL object-oriented database (MongoDB)

Getting data into Discovery

Vast number of different formats feeding into Discovery:

XML, RDBMS, text, spreadsheets etc. These go through a complex/sophisticated data normalisation process and are then fed into MongoDB – the Object Data Store

Discovery data structure

Discovery treats all things as ‘information assets’ – you can build hierarchies by links between assets

http://discovery.nationalarchives.gov.uk/SearchUI/details?Uri=C10127419

The last number here is a unique and persistent identifier for an information asset [not clear what level this is]

Discovery API examples

Documentation at http://discovery.nationalarchives.gov.uk/SearchUI/api.htm

API endpoint at: http://discovery.nationalarchives.gov.uk/DiscoveryAPI

Just 6 calls supported (see http://discovery.nationalarchives.gov.uk/SearchUI/api.htm)

Can specify xml or json as format for response: http://discovery.nationalarchives.gov.uk/DiscoveryAPI/xml/ or http://discovery.nationalarchives.gov.uk/DiscoveryAPI/json

Search: http://discovery.nationalarchives.gov.uk/DiscoveryAPI/xml/search/{page}/query= or http://discovery.nationalarchives.gov.uk/DiscoveryAPI/json/search/{page}/query=

30 results per page

e.g. http://discovery.nationalarchives.gov.uk/DiscoveryAPI/json/search/1/query=C%20203

See documentation at http://discovery.nationalarchives.gov.uk/SearchUI/api.htm for details of other calls.
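
As a quick sketch, fetching the first page of results from the JSON search call above might look something like this (the structure of the returned JSON isn’t covered in these notes, so the final inspection step is just a starting point):

require 'net/http'
require 'uri'
require 'json'

# first page of results (up to 30) for the query 'C 203', requested as JSON
uri = URI('http://discovery.nationalarchives.gov.uk/DiscoveryAPI/json/search/1/query=C%20203')
response = Net::HTTP.get_response(uri)
results = JSON.parse(response.body)
puts results.keys   # inspect the top-level structure of the response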

Next steps

Now have the Discovery Platform and are getting people to use the API – the next plan is to build a Data Import API, so that external data can be brought into the Discovery platform. Also want to build a User Participation API.

Interoperability in Archival descriptions

Jenny Bunn from UCL starting with a summary of history of archival description standards – from USMARC AMC (1977) to ISAD(G) (1st edition formally published 1994).

Meanwhile WGSAD in the US published ‘Standards for Archival Description: A Handbook’ – also in 1994. It contains a wide variety of standards relevant to archives – from technical standards to the Chicago Manual of Style.

EAD has its origin in encoding the Finding Aid – not in modelling archive data per se. EAD v1.0 was released in 1998

Also a mention of ISO23081 – metadata for Records (records management)

Bunn suggests that ISAD(G) is designed for information exchange – not for Archival Description. Specifically, ISAD(G) doesn’t discuss the authenticity of records. At this point (says Bunn) ISAD(G) is more a straitjacket than an enabler.

Call to action – move to ‘meaning’ vs information exchange in standards.

Point from Jane Stevenson that ISAD(G) is not that great for information exchange! But Jenny makes the point that as a schema it could serve the purpose – the lack of a content standard is a barrier to information exchange even within ISAD(G)