Discovery API at The National Archives

Aleks Drozdov is the enterprise architect for the Discovery system at The National Archives (TNA). He is going to speak about APIs and data, and how they are implemented in the Discovery system at TNA.

My Introduction to APIs post is relevant to this talk.

API and Data

API = Application Programming Interface. Web API – in a web context an API is typically defined as a set of messages over HTTP. Response messages are usually in XML or JSON format.
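To make the point about response formats concrete, here is a minimal Python sketch that reads the same record from a JSON body and from an XML body. The payloads and the field names (`id`, `title`) are invented for illustration and are not taken from the Discovery API.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical response bodies: field names are invented for illustration.
JSON_BODY = '{"id": "A1", "title": "Example record"}'
XML_BODY = "<record><id>A1</id><title>Example record</title></record>"


def parse_json(body):
    """Parse a JSON response body into a plain dict."""
    return json.loads(body)


def parse_xml(body):
    """Parse a flat XML response body into a dict, one key per child element."""
    root = ET.fromstring(body)
    return {child.tag: child.text for child in root}


# Both formats carry the same information once parsed.
print(parse_json(JSON_BODY) == parse_xml(XML_BODY))
```

Whichever format a client asks for, it ends up with the same structure in memory, which is why many web APIs offer both.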

Data – explosion in the amount of data available. Common to ‘mashup’ (combine) data from a number of sources. Also user-contributed data.

Discovery Architecture

At the base is an ‘Object Data Store’ – a NoSQL document-oriented database (MongoDB).

Getting data into Discovery

Vast number of different formats feeding into Discovery:

XML, RDBMS, text, spreadsheets etc. These go through a complex/sophisticated data normalisation process and are then fed into MongoDB – the Object Data Store.
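The talk did not go into how the normalisation works, so the following is only a sketch of the general idea: records arriving in different formats are mapped onto one common shape before being stored. The input layouts and the field names (`reference`, `title`) are assumptions made for this example, and the final write to MongoDB is left out.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical inputs: two sources describing records in different formats.
CSV_SOURCE = "ref,title\nX 1, First item \n"
XML_SOURCE = '<items><item ref="X 2"><title>Second item</title></item></items>'


def from_csv(text):
    """Yield one normalised record per CSV row."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"reference": row["ref"].strip(), "title": row["title"].strip()}


def from_xml(text):
    """Yield one normalised record per <item> element."""
    root = ET.fromstring(text)
    for item in root.findall("item"):
        yield {
            "reference": item.get("ref", "").strip(),
            "title": (item.findtext("title") or "").strip(),
        }


# After normalisation every record has the same keys, whatever its origin.
records = list(from_csv(CSV_SOURCE)) + list(from_xml(XML_SOURCE))
for record in records:
    print(record)
```

Each source format gets its own small adapter, and everything downstream only ever sees the common shape.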

Discovery data structure

Discovery treats all things as ‘information assets’ – you can build hierarchies by links between assets.

The last number in the example is a unique and persistent identifier for an information asset [not clear what level this is].
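One way to picture hierarchies built from links between assets is a flat collection of assets that each point at a parent. The sketch below is an assumption about the general shape, not Discovery's actual schema: the class name, the `parent_id` field, the identifiers and the level names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class InformationAsset:
    """Hypothetical asset: an identifier, a title and a link to a parent."""
    asset_id: str
    title: str
    parent_id: Optional[str] = None


def path_to_root(assets: Dict[str, InformationAsset], asset_id: str) -> List[str]:
    """Follow parent links upwards and return titles from the root down."""
    path = []
    seen = set()  # guards against accidental cycles in the links
    current = assets.get(asset_id)
    while current is not None and current.asset_id not in seen:
        seen.add(current.asset_id)
        path.append(current.title)
        current = assets.get(current.parent_id) if current.parent_id else None
    return list(reversed(path))


# Invented sample data: three levels linked child -> parent.
ASSETS = {
    a.asset_id: a
    for a in [
        InformationAsset("1", "Department"),
        InformationAsset("2", "Series", parent_id="1"),
        InformationAsset("3", "Piece", parent_id="2"),
    ]
}

print(path_to_root(ASSETS, "3"))
```

Because every asset has the same shape, the hierarchy lives in the links rather than in separate types per level.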

Discovery API examples

Documentation at

API endpoint at:

Just 6 calls supported (see

Can specify xml or json as format for response: or

Search:{page}/query= or{page}/query=

30 results per page
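With a fixed page size, a client can work out how many pages it needs to request for a given number of hits. A small sketch of that arithmetic follows, using the 30 results per page mentioned in the talk. The function names are made up for this example, and numbering pages from 1 is an assumption that should be checked against the API documentation.

```python
import math

RESULTS_PER_PAGE = 30  # page size as stated in the talk


def page_count(total_results):
    """Number of pages needed to hold total_results hits."""
    return math.ceil(total_results / RESULTS_PER_PAGE)


def page_for_result(position):
    """Page on which the hit at a 1-based position falls.

    Assumes pages are numbered from 1 (not confirmed by the talk).
    """
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return (position - 1) // RESULTS_PER_PAGE + 1


print(page_count(95))
print(page_for_result(31))
```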


See the documentation for details of other calls.

Next steps

Now that the Discovery platform is in place and people are being encouraged to use the API, the next plan is to build a Data Import API so that external data can be brought into the Discovery platform. TNA also want to build a User Participation API.
