Web Services

This is one of the sessions I’ve been looking forward to most. First up, Mark Ellingsen is talking about web services and their relation to library technologies and Ex Libris products.

Web services are technologies which enable inter-application communication. They are based around the exchange of information in XML format, using standard protocols (usually HTTP).

One key component of web services is XSD – the XML Schema Definition language. This provides structure, validation rules, data type constraints and inter-element relationships – i.e. it defines what information can be put into the XML document, and how that information is structured.
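As a toy illustration (my own sketch, not from the talk – the patron record and field names are invented), the kind of rules an XSD expresses can be mimicked in Python with the standard library’s XML parser:

```python
import xml.etree.ElementTree as ET

# A hypothetical patron record, of the kind a library web service might exchange.
message = """
<patron>
  <id>12345</id>
  <name>Jane Smith</name>
  <loans>3</loans>
</patron>
"""

def check_structure(xml_text):
    """Mimic the kind of rules an XSD declares: required elements and a data type."""
    root = ET.fromstring(xml_text)
    required = {"id", "name", "loans"}
    present = {child.tag for child in root}
    if not required <= present:
        return False
    # An XSD might declare <loans> as xs:integer; emulate that constraint here.
    return root.findtext("loans").strip().isdigit()

print(check_structure(message))  # True for the record above
```

In a real deployment the schema itself would be a separate XSD file, and the receiving application would validate incoming messages against it before processing them.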

There are two approaches to constructing web services – one based around SOAP, and the other based on REST.

The first approach uses three components:

  • SOAP (Simple Object Access Protocol) is a standard messaging format
  • WSDL – the Web Services Description Language – defines web services
  • UDDI – Universal Description, Discovery and Integration – allows for the registration of web services to make them findable and usable by other applications.
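To make the SOAP part concrete, here is a rough sketch in Python of building a minimal SOAP 1.1 envelope – the envelope namespace is the standard one, but the GetPatron operation and its parameter are invented for illustration:

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def build_envelope(operation, params):
    """Wrap an operation call in a minimal SOAP 1.1 envelope."""
    env = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)  # e.g. a hypothetical GetPatron call
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(env, encoding="unicode")

print(build_envelope("GetPatron", {"patronId": "12345"}))
```

The point is that SOAP is just XML in a standard wrapper, normally POSTed over HTTP; the WSDL document is what tells a client which operations and parameters a service actually offers.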

The second architectural style is REST – Representational State Transfer. In this style a URI is a representation of a resource; the client then moves (transfers) from state to state based on a transition from one URI to another. HTTP methods like GET and PUT are used to change state.
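A minimal sketch of the REST style using Python’s standard library – the resource URIs are invented, and no request is actually sent here:

```python
from urllib.request import Request

# Hypothetical resource URIs: one per patron, one per loan.
patron_uri = "http://library.example.org/patrons/12345"
loan_uri = "http://library.example.org/patrons/12345/loans/987"

# GET retrieves the current representation of the resource...
get_req = Request(patron_uri)  # GET is the default method

# ...while PUT replaces it, transferring the client to a new state
# (here, an invented loan-renewal payload).
renew_req = Request(loan_uri, data=b"due=2005-10-01", method="PUT")

print(get_req.get_method(), renew_req.get_method())  # GET PUT
```

Each resource gets its own URI, and the standard HTTP verbs do all the work – which is why REST needs no extra messaging layer, but is tied to HTTP.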

SOAP is supported by many major players – including IBM, Sun and Microsoft. REST is easier to implement, but is restricted to HTTP transport.

In the library world – Fedora, Amazon and ZING all support SOAP and REST (or REST-like) approaches.

Also in this area is WSRP – Web Services for Remote Portlets – which allows the integration of a portlet (or ‘channel’) into a portal environment. This is aimed at a ‘plug and play’ approach to portals, which are currently often built by hand. For example, we supply Aleph patron data in the RHUL portal – but it is all specially coded, and had to be re-coded when we upgraded from one version of Aleph to another.

Aleph X-Services are a step in the right direction, but are not currently web services, which makes them difficult to utilise in a practical situation.

However, you need to approach the problem from both sides – web services don’t make a useful environment – they just enable integration and communication between systems. We need to understand what we are trying to do – what systems need to talk to each other, and what they need to be able to say – then we need web services to support these interactions.

Mark is now mentioning the DLF Service Framework – something I obviously need to look at. The Working Group report indicates that we need to understand what services we need to provide, and that the absence of common models stops us developing useful services.

The DLF defines a service as ‘a functional component which it is useful to talk about as a unit’. Mark adds to this, but I didn’t get down his additions in time – slow down Mark!

The DLF asked ‘At what level of granularity and aggregation should services be designed?’ – that is to say, should we look at a ‘library service’ or a ‘loan renewal service’? I suspect that we need to group services together somewhere between these levels – say ‘library patron services’ or ‘search and retrieval services’. I also suspect that librarians (myself included) tend to be too granular when we think about the services we supply.

Going on to Mark’s next slide – before we can define the services, we need to think about our business processes. Until we understand these, we cannot define the services we need, and then develop APIs and build useful applications.

Finally Mark is mentioning VIEWS – the Vendor Initiative for Enabling Web Services. However, Mark sees them focussing on the wrong things initially – gateway searching, and authentication and authorisation – things that are already being worked on by other organisations. It would be more interesting, Mark suggests, to focus on things such as financial transactions and interactions with e-learning systems.

So – Mark’s tasks for the library community:

We need to understand library processes
We need to understand how they interface with other processes within the organisation

Mark says he hasn’t ever seen a diagram which shows the interactions between the library business process and the rest of the institution.

and tasks for Ex Libris:

Integrate web services into the x-services API, and provide a more functionally rich x-services environment.

Wi-Fi problems

Unfortunately the wi-fi connection is currently misbehaving – what a pain. It seems to be a problem with the plug-in and wireless access interfering with each other. I can’t believe that this is causing a problem. I can run a perfectly good wireless and wired network in my own house, but then I come to a conference at the BL, pay my 35 pounds, and it doesn’t work.

Unfortunately it is one of those situations where the wireless network is provided by a commercial company (Leith’s), and the wired network is provided by the BL IT dept – so very difficult to get to the bottom of what the problem is.

I should say that the BL IT staff have been very helpful, as have the person on the wireless network helpline. However – this doesn’t get round the fact that I’ve paid for a 30 day pass on the wireless network, and it doesn’t work! I also can’t get an answer about who is going to refund my money (if it comes to that – I hope it doesn’t).

I’m also relying on being connected to be able to get through other work, and keep in touch with home base while I’m here – in some ways it is this connection which allows me to attend a conference for a week without compromising my work.

You may gather – I’m a bit annoyed!

Scholarly Communication – part 2

Now the second speaker – Reagan Moore from the San Diego Supercomputer Centre (SDSC) – is talking about the implications for institutions managing scholarly communications. The first point is the huge amount of data we are talking about: at SDSC, overall, over 500 terabytes of data, and in the order of 68 million files!

To put this in context, it is estimated that all the books in the Library of Congress consist of only 20 terabytes of data (in text) – so the amount of information is absolutely huge. We need to quickly realise that we are dealing with information on a new scale – and start to get to grips with the problems and solutions.

At SDSC they have a ‘data grid’ which is essentially a middle layer sitting between the data access methods (essentially the user (or machine) interfaces to the data), and the data storage – which may be a variety of formats in a variety of systems.

All of this stuff is better illustrated by diagrams!

Essentially, what they call the ‘Data Grid’ allows the abstraction of user interface from the storage. In terms of the ‘digital library’, they have done an integration with DSpace which allows the management of information spread across different file systems in different locations.
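My own toy sketch of the idea (nothing to do with SDSC’s actual code): callers name a logical file, and the grid layer resolves it to whichever backend actually holds it – all the names and paths below are invented:

```python
# Toy sketch of a "data grid" layer: a logical namespace over multiple stores.
class DataGrid:
    def __init__(self):
        self.catalogue = {}  # logical name -> (backend name, storage key)
        self.backends = {}   # backend name -> storage system (dicts stand in here)

    def register(self, name, backend):
        self.backends[name] = backend

    def ingest(self, logical_name, backend_name, key, data):
        """Store the data in a named backend and record where it went."""
        self.backends[backend_name][key] = data
        self.catalogue[logical_name] = (backend_name, key)

    def fetch(self, logical_name):
        """Retrieve by logical name; the caller never names the backend."""
        backend, key = self.catalogue[logical_name]
        return self.backends[backend][key]

grid = DataGrid()
grid.register("local_fs", {})      # stand-ins for real storage systems
grid.register("tape_archive", {})
grid.ingest("thesis-2005-001", "tape_archive", "/vol7/t001", b"PDF bytes")
print(grid.fetch("thesis-2005-001"))
```

The value of the abstraction is that storage can be moved, replicated or migrated between systems without the user interface (or a digital library application like DSpace) having to change.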

Talking about the National Science Digital Library (which uses Fedora), Reagan is saying that they needed to crawl and harvest material from web sites, and found that the majority of URLs being entered into their system became invalid after 3 months. So they have been looking at how to build a digital archive of this material. However, I think we need to ask some hard questions about what we preserve. Should we not ask why the material is disappearing so quickly from the original source?

Perhaps more interestingly, huge amounts of data are being produced by research centres (e.g. astronomy data, seismic studies) – and how do we store, and provide access to, all of this?

An interesting question is what the role of the library is in the world of data grids and huge amounts of data. Reagan is indicating that the library is still seen both as the ‘indexer’ and, perhaps more importantly, as the provider of the user interface to these collections. (Phew – that’s a relief – no chance of librarians being out of a job then.) However, perhaps we need to look at the skill sets needed. Paul Ayris is suggesting that we need to look to archivists and records managers for skills – which links back to my question above about what we preserve and what we don’t. Steven Pinfield, from the University of Nottingham, who is chairing the session, has also prompted Ex Libris to think about the role of the library system software supplier in a world in which the ‘data grid’ is a key part of the information landscape.

Some interesting questions about the approach of distributed data, and also the question of access to information. The demands of keeping certain data confidential (to preserve IPR) mean controlling access to every ‘object’ in a data grid – so we are talking about controlling access to 68 million individual files. However, there is also a move to making the data available to the wider community – the raw data should be available; the analysis is where the ‘value’ is added.

Scholarly Communication

This session consists of two presentations on Scholarly Communication. The first is by Paul Ayris from UCL. Interestingly he has said that UCL is now asserting that the IPR of work produced at UCL belongs to the academic and student authors. I find this surprising, as I would have thought UCL would want to assert the rights of the institution (as employer) to the IPR on work produced. This may not apply to student authors (the status of students within UK universities is often unclear in terms of ’employment’ or ‘belonging’). However, for staff, I would have thought it was clear they do research under the auspices of the employer – which is the University.

Some interesting comments from Ian Gibson, who chaired the Parliamentary committee in the UK which looked at scholarly communication. His feeling was that the outcomes of the committee meetings included recommendations that funding bodies should mandate the deposit of research in Open Access repositories, and also that there should be more clarity on ‘quality’ standards – with the possibility of some ‘kitemarking’ to indicate quality (the kitemark is a British Standards logo which indicates that a product meets certain quality standards).

Paul is now talking about the outcomes of a UK-based project looking at the use of DigiTool. The most pertinent part of this project was the UCL E-Theses trial. Unfortunately this only looked at DigiTool 2.4, which is a very different system to DigiTool v3.x.

A separate JISC-funded project on digital curation, based around a partnership between UCL, the British Library and Ex Libris, is due to report in 2006 (http://www.ucl.ac.uk/ls/lifeproject/index.shtml).

Paul is now talking about the outcome of a survey of taught students in 2000, where the key thing that they thought would improve library services was more core texts available. Paul has highlighted that no-one asked for more e-resources or mentioned e-learning. However, I’m not convinced that this is very meaningful – you would always see taught students asking for more texts – this is very much (in my view) based around what they are told to read. If they are told to use e-resources or e-learning by their tutors, this will become a key issue for them.

Paul is moving onto ‘Course Reading’ approaches – highlighting the lack of provision from Ex Libris in this key area, and the fact that several UK universities are using the open source software from University of Loughborough.

Paul has now put up an interesting diagram done by Margaret Flett from UCL describing information systems and access at UCL. Hopefully this will be made available online, as I would like to compare it to the RHUL picture. There are some things I would question (e.g. the labelling of MetaLib as an ‘optional’ gateway), but a very interesting picture. One important point is the way that the ‘VLE’ has become a key way for students to access information at UCL.

Paul is now talking about how Ex Libris s/w interacts with Open Source software and Open Access content. Paul is questioning the fact that MetaIndex – the Ex Libris add-on to MetaLib which harvests and searches OAI-compliant repositories – is a chargeable product. However, I disagree that this is a problem (although I won’t object if we get it free!) – if the product needs developing and supporting, why should it be ‘free’? I’m not even sure what Paul is suggesting here – that Ex Libris should develop this s/w and give it away? This seems redundant – there is already free software for OAI harvesting – what is the benefit to Ex Libris of developing this again and giving it away?

However, I do agree that Ex Libris could use our understanding of what we need to provide here. I’m not sure we quite understand what our users need in terms of access to information stored in OAI environments. Perhaps the point is that Ex Libris needs to come up with a more compelling offering in this area, which is then worth paying for. An example might be thinking about how Ex Libris s/w might help users navigate the different versions of a paper, perhaps having discovered the Open Access pre-print, and wanting to see the published version. I think many of the building blocks are in place already (MetaLib, SFX, DigiTool) but we need to see how these work together to deliver the correct user experience – and of course, we need to know what the desired user experience is!
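For context, OAI-PMH harvesting itself is technically simple – a harvester issues plain HTTP requests with a ‘verb’ parameter. A sketch of building such a request in Python (the repository URL is invented; the verb and metadataPrefix values are standard OAI-PMH):

```python
from urllib.parse import urlencode

def oai_request(base_url, verb, **kwargs):
    """Build an OAI-PMH request URL, e.g. for harvesting Dublin Core records."""
    params = {"verb": verb, **kwargs}
    return base_url + "?" + urlencode(params)

# A hypothetical institutional repository endpoint.
url = oai_request("http://eprints.example.ac.uk/oai2",
                  "ListRecords", metadataPrefix="oai_dc")
print(url)
```

Which rather supports the point above: the harvesting plumbing is not where the value lies – the value is in what a product does with the harvested records afterwards.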

Interfaces and IPO

A couple of interesting points in the opening presentation from Matti Shem-Tov (CEO of Ex Libris) – one is an initiative to develop the various ‘public’ interfaces to the Ex Libris products (Aleph OPAC, SFX, MetaLib, XML services) – I’m looking forward to hearing more about this. Secondly, Ex Libris is going ‘public’ as a company. They are expecting to make an initial offering in mid-October. This is to raise funds for development, and to fund expansion of the company – especially in the Asian market. Also to fund the possible acquisition of complementary companies and technologies – which is very interesting …

For all those eager investors out there, I’m afraid that the offering is being made on the AIM Stock Exchange – which means that the shares will not go to individual investors, but to financial institutions.

ICAU and SMUG 2005

The annual meeting of the main user groups for the library software we use at RHUL – Aleph, SFX and MetaLib. This year it is a bit special because it is at the British Library in London, and I’ve been helping organise it. I have to admit that I feel a bit of a fraud, as other people have done all the hard work, but it is nice to see it all coming together.

Anyway, I’m very much looking forward to the meeting – both in terms of meeting other users (always interesting and fun) and also because some of the presentations promise to be very interesting – some stuff on web services and library systems, and a presentation on the new Verde product from Ex Libris, which is an electronic resource management system (I contributed to discussions on its design and functionality as part of a user focus group).

It is also a chance to catch up with friends made through similar meetings over the last 5 years, since we started using the Aleph library management system at RHUL.