Spotlight on Names

A few people have been kind enough to test out my Composed bookmarklet and give some feedback (here on Google+ amongst other places). A couple of people identified composers on COPAC for which my bookmarklet didn’t produce any information. When I checked, this was because the underlying data I’m using, the MusicNet Codex, didn’t have a record for those composers.

This reinforced an idea that keeps coming back to me as I think about Library data and Linked Data: we need good ways of capturing feedback from consumers/users of the data and then re-expressing it in our data. In the case of the Composed bookmarklet, it seems sensible to let people say at least:

  • “This record contains a composer [name or identifier] you don’t have in your data”
  • “For this record the bookmarklet displays a composer not mentioned in the record”

This triggered some further thoughts: a bookmarklet could be a nice way of generally allowing those interested (librarians or others) to add information to the record – specifically structured information and identifiers for people (and perhaps some other entities). Then, while I was discussing the #discodev competition with Mathieu D’Aquin at the Open University today, he mentioned the DBpedia ‘Spotlight’ tool, which does entity extraction and gives back identifiers from DBpedia.

So how about a bookmarklet which:

  • Grabs the ‘people’ fields (100, 700, 600 – others?)
  • Passes contents to Spotlight and gets back possible DBpedia matches
  • Links to VIAF (I think this is possible where VIAF has a Wikipedia URI; possibly do this after the user’s decision in the next step)
  • Allows the user to confirm or reject the suggestions – if they confirm, lets them state a relationship as defined by MARC relators (available as Linked Data at http://id.loc.gov/vocabulary/relators.html)
  • Posts a triple linking the catalogue record, via the relator, to the DBpedia URI and/or VIAF URI
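
To make the first and last steps a bit more concrete, here is a rough sketch in Python (a real bookmarklet would be JavaScript, so this only shows the shape of the data). Everything in it is an assumption for illustration: the record is a plain dict of MARC tags, the catalogue record URI is made up, the function names are hypothetical, and the Spotlight lookup and the user’s confirmation are skipped entirely, with the DBpedia URI hard-coded as if it had already been confirmed.

```python
# Hypothetical sketch: the record layout, function names and the
# catalogue record URI are assumptions, not part of any real service.
PEOPLE_TAGS = ("100", "600", "700")
RELATORS_BASE = "http://id.loc.gov/vocabulary/relators/"


def people_fields(record):
    """Return (tag, value) pairs for the 'people' fields of a record.

    `record` is assumed to be a dict mapping MARC tags to lists of strings.
    """
    return [(tag, value) for tag in PEOPLE_TAGS for value in record.get(tag, [])]


def relator_triple(record_uri, relator_code, person_uri):
    """Format one N-Triples line: <record> <relator> <person> ."""
    return "<{}> <{}{}> <{}> .".format(
        record_uri, RELATORS_BASE, relator_code, person_uri
    )


record = {
    "100": ["Britten, Benjamin, 1913-1976"],
    "245": ["Peter Grimes"],
    "700": ["Slater, Montagu, 1902-1956"],
}

# Step 1: grab the fields that would be passed on for entity extraction.
fields = people_fields(record)

# Last step: the triple that would be posted once the user has confirmed
# the match and picked the 'cmp' (composer) relator.
triple = relator_triple(
    "http://example.org/record/12345",  # made-up catalogue record URI
    "cmp",
    "http://dbpedia.org/resource/Benjamin_Britten",
)

print(fields)
print(triple)
```

Using the relator as the predicate keeps each confirmed suggestion down to a single line that is easy to post and easy to harvest, though a real service would probably also want to record who made the assertion and when.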

This could then be harvested back by libraries or others to get more expressive linked data relating bibliographic entities to people entities with a meaningful relationship. I haven’t looked at this in detail – but I don’t think it would be very difficult – my guess is just a few hours’ work.

I think this also starts to address another issue that always comes up when discussing libraries and linked data: how linked data might start to become part of the metadata creation process in libraries. Since the bookmarklet relies on an existing record, it doesn’t really get there – but if libraries are going to successfully exploit linked data we need to play around with interfaces that help us integrate linked data into our data as it is created.