Send in the clouds

Cloud computing seems to be the buzzword of the moment, and there is currently a lot of media coverage – especially in the light of Microsoft’s recent announcement of their own take on the cloud with Azure. I’ve also been following some less high-profile, but nonetheless thought-provoking, discussions about other aspects of the cloud, such as the ‘data cloud’, on blogs and Twitter.

Why cloud? At some point it became common practice to represent the network as a ‘cloud’ in network and computing architecture diagrams like this:

(Image courtesy of stephthegeek, some rights reserved under a Creative Commons Attribution-Noncommercial-Share Alike licence)

Here the cloud represents the complexity of the Internet, but it also suggests that, from the network point of view, things are quite simple – stuff goes in one end, and comes out the other.

The concept of ‘cloud computing’ is that you can have services that sit on the Internet (i.e. ‘in the cloud’) and use them in the same way – push stuff in, get stuff out, don’t really care how it happens.

Examples of ‘cloud computing’ that are often cited are:

  • Google Docs
  • Amazon S3
  • Amazon EC2

The first is a suite of office tools that exists only online – you don’t download anything to your PC, and you interact with the tools via your browser.

The second two are services from Amazon – the first is a storage service, where you can use hard disk space on Amazon servers to store stuff, and the second is a service that allows you to run virtual servers, on demand, on Amazon hardware.

This week Microsoft announced its own take on the cloud in the form of Azure – which will provide a way of synchronising between your desktop, your mobile and ‘the cloud’.

In a recent post to the ZDNet Semantic Web blog, Paul Miller goes on to talk about the possibility of a ‘data cloud’ – linking it to the idea behind the ‘semantic web’ – that is, creating a web of data, all inter-linked.

Paul Walk argued (and I’m inclined to agree) that you can’t talk about data in the same way as computing power – you care about data in a different way, and would never accept just ‘any old data’. In a comment on that post, Chris Rusbridge suggests that what Paul is referring to is the provenance of the data – again, I’m inclined to agree.

However, I’d say that this is actually as true of obtaining computing power as it is of data. Although, as Paul Walk notes, I may not care about the particular hardware that I’m utilising, or where it lives, I do care that I’m being offered a robust service – and so I don’t want just any old computing power – I want the good stuff! I personally use the Amazon S3 service to back up data – because I trust that Amazon is going to be pretty reliable – I wouldn’t trust the same data to some bloke running a ‘cloud computing’ service from his garage.

The difference for me between the Internet as a ‘cloud’ and the idea of ‘cloud computing’ is that when I transmit data over the Internet as a network, I’m trusting not in a single provider, but essentially in a technical protocol and infrastructure to get stuff from one place to another – and although one part of that journey is governed by someone I’ve chosen (my ISP), most of it isn’t. When I choose a ‘cloud computing’ service, I trust a ‘brand’ that provides the service – admittedly I don’t ask questions about how they provide it (do they subcontract? how would I know?), but I’m not just throwing a task at a generalised technical solution and saying ‘store this’ or ‘process that’.

I would argue that peer-to-peer networks are much closer to the idea of ‘cloud’ computing than Amazon’s or Google’s services. If I upload something to a peer-to-peer network, then it is potentially going to be stored in lots of places, and I won’t know where. For some data this might work (stuff that I really want to share), but for other data (stuff I want to keep but perhaps not share) it won’t.

Skype also uses peer-to-peer technology to route calls – and again, I would argue that this is much closer to a situation where you really “don’t care” where the processing takes place – as long as your call holds up.

So, I think that what is being called ‘cloud computing’ is actually SaaS – Software (or Storage I guess) as a Service. SaaS is a model where you obtain access to software that is hosted elsewhere – so typically via the Internet. When I use Google Docs or Amazon S3 this is really what I’m doing.

Several library system vendors offer SaaS versions of their products – although without incredibly enthusiastic uptake in the academic library sector (see the post from Dave Pattern and various comments at http://www.daveyp.com/blog/archives/303). I have in the past been a bit sceptical about the idea of SaaS, but as I note in my comment on the post above, I’m now much more convinced.

I think that Paul Miller’s arguments make more sense in this context – DaaS (Data as a Service) – that is, getting your data hosted somewhere else makes sense. However, Paul is arguing for something a bit more than this – data that is hosted in a way that makes it accessible and linkable – and this is something that I think libraries need to get to grips with. There is a lot of talk of ‘data silos’ and how libraries are guilty of perpetuating them – we need to break out of this paradigm. I was very depressed to see a comment on an email list this week that said “There is something to be said for the library’s catalogue being self contained and inhouse” – I think this is an attitude we have to change. Although I understand the arguments about reliability (e.g. in the face of network failure), we can overcome these problems without having systems that are ‘self contained’ – and if we are to have library data as part of the cloud, we need to.

5 Years, 2 days

That’s how long I’ve been blogging here. Looking back, I’m relatively surprised that I’ve actually posted with reasonable consistency. If I’m at a conference or other event I blog a lot, trying to capture as much from the sessions as I can; otherwise I tend to post a couple of times a month.

I thought a Wordle of my blog would be a good way of celebrating 5 years, so here it is – this is based on all entries, comments and pingbacks from the last 5 years (but not this entry).

Mashed Library ’08 – Register Now

You know you want to.

Registration for Mashed Library '08 is now open at http://www.ukoln.ac.uk/events/mashed-library-2008/

There is no charge for the day, thanks to my employer (Imperial College London), sponsorship from UKOLN (http://www.ukoln.ac.uk), and the donation of time and space from Birkbeck College London (esp. thanks to David Flanders for this).

In the first instance places are limited to 25 people. If demand proves sufficient we'll look at whether this can be increased. Registration closes on 14th November. Hope you can make it.

The date and venue for the event are as follows:

Mashed Library '08 (see how optimistic I am) will be at Birkbeck College in London on 27th November:

http://www.bbk.ac.uk/maps

As I've described previously, the idea is to have a reasonably informal event at which we try to do interesting stuff with library technology and/or data.

The day will start at 10 and finish at 5, and there is a notional structure for the day posted in the Structure forum at the ning site (http://mashedlibrary.ning.com), and replicated below. If you can't use the ning site for any reason, feel free to leave a comment here, or drop me a line via email or twitter.

Notional Structure

  • 10am Start
  • 10-11 Dummies' guide to … (some short presentations on some of the tech/tools that might be of use during the day – post requests/suggestions below)
  • 11-4 Mashup – work in teams or individually to do interesting stuff
  • 4-5 Round up of mashups and close

I had initially thought that we could have a 'pitch' session where people pitched ideas for development, but in order to save time on the day, I suggest that any suggestions for projects for the day are discussed beforehand – again you can leave comments here, or use the discussion boards at http://mashedlibrary.ning.com

I've had some interest shown in remote participation, and I'm happy to see what we can do to support this, although I'm not quite sure what form this participation should take – if you are interested in this, please post at http://mashedlibrary.ning.com/forum/topic/show?id=2186716%3ATopic%3A127, or leave a comment below, and I'll see what I can do (not promising anything at this stage though!)

The tag for the event is mashlib08 if you want to use it on posts, tweets, etc.

Using Twitter to Live Blog ILI08 – some thoughts

Following some comments and feedback today on my live blogging from ILI08, I thought I’d post a few more thoughts on this.

I found using Twitter a pretty good way of posting to the live blog. When I lost wifi access on my laptop and switched to using my iPhone for a bit, there was absolutely no problem – I could just keep blogging.

Rather than using my normal twitter account to do the blog, I set up a new account that was meant to be event specific. This allowed me to keep the live blog tweets separate from my personal tweets, and meant I didn’t overwhelm my followers with tweets (except those who chose to follow the new account). However, I did find that Twitter expects a unique email address for each account – and the idea of setting up a new twitter account, and a new email address, for each event I want to blog is not appealing. So I suspect that I just need a twitter ‘live blog’ account alongside my personal one.
One final thing was that some of my existing followers sent comments to my normal twitter account, so I retweeted these to my live blogging account – clearly there is scope for some confusion here, and another reason for not having more twitter accounts than absolutely necessary.

One of the other issues I wanted to tackle was differentiating between reporting what the speaker was saying, and my own comments on it. I had an idea that by using my personal account alongside my live blog account, I could differentiate between these two things. However, this felt a bit artificial, and I think it risks losing the ‘voice’ from the blogging – I’m not sure I would use this device again.

One of the commenters on the UK Web Focus post that started me off thinking about Twitter for live blogging pointed at a service, http://livetwitting.com/. I think this, used in conjunction with a dedicated ‘live blogging’ Twitter account, could well be a great solution – I’ll try to remember to give it a go next time I’m at an event – I especially like the way it supports annotating the blog with session and speaker names (and Q&A bits). The other thing is that since you are doing it via Twitter, even if livetwitting doesn’t work so well, you’ve still got the twitter stream.

One of the other things I liked about the idea of using twitter was that it would be possible to manipulate the output, and this was true to a certain extent. My preferred way of extracting the liveblog was the Twitter search API – I used a search for all tweets from the ostephensili08 account, and all tweets referencing the account – the syntax is extremely simple, and you can output results as atom or json. However, one issue is that you can output a maximum of 100 tweets at a time, and there doesn’t seem to be a way of knowing how many tweets in total have matched your search – so when pulling these results together I have had to manually work out that I have 4 pages of results.

I pulled together these 4 pages of results into a single RSS feed/JSON file using Yahoo Pipes. However, in some cases (see below) I have used the twitter search results in their raw atom format.
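
As an aside, the same pulling-together could be scripted rather than done in Pipes. Here’s a rough sketch in Python of what that might look like – it assumes the search API behaves as described above (a q parameter for the query, rpp for results per page up to 100, and a page parameter); the field names in the json output (results, created_at, from_user, text) are my assumptions from what I’ve seen, and the 4 pages are hard-coded because, as noted above, there’s no way to discover the total:

    # A rough sketch (Python 2) of merging the liveblog pages without Yahoo Pipes.
    # Assumes the Twitter search API as described above: JSON output from
    # search.twitter.com, with q (query), rpp (results per page, max 100)
    # and page parameters. Field names in the results are assumptions.
    import json
    import urllib
    import urllib2
    from email.utils import parsedate

    QUERY = "from:ostephensili08 OR @ostephensili08"
    BASE = "http://search.twitter.com/search.json"

    tweets = []
    for page in range(1, 5):  # 4 pages, worked out manually as noted above
        params = urllib.urlencode({"q": QUERY, "rpp": 100, "page": page})
        data = json.load(urllib2.urlopen(BASE + "?" + params))
        tweets.extend(data["results"])

    # The API returns newest first, so sort back into date/time order
    tweets.sort(key=lambda t: parsedate(t["created_at"]))

    for t in tweets:
        print "%s  %s: %s" % (t["created_at"], t["from_user"], t["text"])

Sorting on the parsed created_at date reproduces the date/time ordering step that I otherwise did in Pipes.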

Chris Keene left a comment on my last post suggesting the use of FriendFeed – I need to have a look at this and see how it works. Chris also shows how Dipity can be used to display the twitter stream – so thanks to him I’ve set up an account and used the Twitter search api to bring in all tweets from @ostephensili08 and any replies sent to this account (which is mostly me talking to myself). There seems to be a problem with Dipity consuming my merged results set via Pipes, so I’ve just used the raw atom feeds from the search api, giving dipity 4 URLs as RSS sources.

The Timeline is perhaps the obvious way of presenting the results – but I found the map display very interesting as well: although I only had a few tweets with places in them, having these picked out so I could see their context was surprisingly useful.

If anyone has any other visualisation suggestions, or ways of displaying the output, leave a comment.

Link to my dipity account: dipity / ostephens

Timeline

Map (only Tweets mentioning places)

ILI08 Liveblog

I was at ILI08 today and decided to try out a live blogging experiment using Twitter (as described in this post).

I had some issues with the wi-fi during the day, and for a bit was reduced to blogging on my iPhone, but in general I was able to post to twitter quite well. Using http://search.twitter.com I was able to get the output in a few formats. At this point I was unsure of the best way of actually presenting this on-screen as a live blog – I was thinking of something similar to the CoverItLive format, but wasn’t sure how to achieve it.

I fiddled around a bit with Yahoo Pipes to bring together a few separate searches from twitter search (because it limits the number of results returned in a single search to 100 tweets) and also used this to sort the tweets into the correct date/time order. Having done this I could get the results as a single RSS feed (http://tinyurl.com/43glub) or JSON file (http://tinyurl.com/4zeygw) – but I was still unsure of the best way to display them in an easily consumable format.

I’ve tweeted for some help with this, and if you have suggestions you could leave a comment here, but my first attempt uses Grazr – below is a Grazr gadget displaying the liveblog account, along with any comments directed at the live blogging account by other twitterers using the @ostephensili08 syntax.

Grazr

Internet Librarian International 2008

Tomorrow I'm attending Internet Librarian International 2008, and I thought I'd try out a different approach to live blogging.

I've been following some of the live blogging that Andy Powell has been doing at http://efoundations.typepad.com/livewire/ and I had some issues with the CoverItLive tool that Andy was using. A discussion about approaches to live blogging then started on Brian Kelly's UK Web Focus Blog, and I suggested that an alternative approach would be to have a Twitter account for the event, and use that as the live blog feed. I felt this approach would have some advantages – I said:


I can see that not everyone would want to have an RSS item pop up every time you finished a sentence, but it would allow much more flexibility in terms of use:

  • Pull together all live streams across several live blogs
  • Searchable after the event
  • Archivable via an RSS aggregator
  • Users could choose their own client to view live stream
  • Can be consumed as push or pull model depending on user choice
  • Can be read off-line

I’m sure there are other possibilities – the point is that RSS is generally more re-usable etc.


Andy identified some problems for him in using Twitter – particularly the inability to go back and edit the transcript later.

So, since this is the first event that I've been to since that discussion, I thought I'd give it a go.

If you want to comment or contribute to the live blog, get yourself a twitter account by signing up at http://twitter.com and then post a tweet starting '@ostephensili08'.

You can either view the 'pure' stream – just me tweeting – as follows:

Follow ostephensili08 on Twitter
View the Twitter stream at http://twitter.com/ostephensili08
Subscribe to the Twitter RSS feed at http://twitter.com/statuses/user_timeline/16794267.rss

Alternatively you can view the stream with others' comments (recommended – bound to be more interesting than just me):

View the Twitter stream at http://search.twitter.com/search?q=from%3Aostephensili08+OR+%40ostephensili08
Subscribe to the RSS feed at http://search.twitter.com/search.atom?q=from%3Aostephensili08+OR+%40ostephensili08
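
If you'd rather follow the stream from a script than from a feed reader, something along these lines should work – this is just a sketch, assuming the Python feedparser library is installed; the entry fields (id, author, title) are feedparser's usual atom mappings, and the once-a-minute polling interval is an arbitrary choice of mine:

    # A rough sketch (Python 2) of polling the liveblog's atom feed.
    # Assumes the feedparser library (http://feedparser.org/) is installed;
    # the feed URL is the search feed given above.
    import time
    import feedparser

    FEED = ("http://search.twitter.com/search.atom"
            "?q=from%3Aostephensili08+OR+%40ostephensili08")

    seen = set()
    while True:
        for entry in feedparser.parse(FEED).entries:
            if entry.id not in seen:  # only show tweets we haven't seen yet
                seen.add(entry.id)
                print "%s: %s" % (entry.author, entry.title)
        time.sleep(60)  # poll once a minute - be polite to the search API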

We'll see how it goes!

A month of living paperlessly?

A few things have come together this week which have resulted in me launching myself on an experiment to see how far I can go towards living (perhaps more accurately, working) without paper.
Firstly, the following exchange on Twitter:
Twitter conversation about ebooks

I had been thinking about e-book readers: I’d had a very brief chance to play with the new Sony e-reader a couple of weeks ago, I’ve recently installed an e-book app on my iPhone, and this week at work we had a couple of ‘New Technology’ workshops where we had some hands-on play time with various pieces of kit, including an iLiad e-book reader which the library bought last year.

So, I’ve decided to see how far I can use the iLiad to replace stuff I currently print out. I may try some other bits and pieces as well – e.g. RSS feeds using http://www.feedbooks.com/ (thanks again to psychemedia and Twitter for bringing this to my attention), actual e-books, newspapers formatted for the iLiad, etc.

So, starting today, for one month, I’m going to give it a go, and see how it works out – I’ll report around November 3rd.
For the moment, here are some first impressions of using the iLiad for this purpose:

I have to admit that the first thing that strikes me is that the iLiad’s makers haven’t gone out of their way to make it easy to use. For a start, the main ‘manual’ doesn’t cover actually getting content from my PC onto the device – for that I have to install special (PC-only) software, and read the manual that goes with that.

When I plug in the device, it allows me to ‘name’ it – but the name can be only 11 characters long! It creates a folder structure with the main folder having the same name (so I now regret naming it ‘IC Library’ rather than ‘iLiad’ or similar, as the latter would have been more obvious in my folder structure). It then (for some reason) also gave an error saying it couldn’t create the folder – even though it seems to have done so successfully. It also creates a whole load of folders – which is confusing to say the least:
iLiad folder structure

For some reason there is a nested structure that replicates the main one – this doesn’t seem to actually do anything though!

I also install the client software (Windows only), which isn’t brilliantly designed either. Interestingly it does have a ‘print clipboard to iLiad’ function – I’ll come to this later.

The easiest format to work with for documents seems to be PDF, although the iLiad also supports a number of e-book formats of course. So, I need a PDF creator – I opt for the free CutePDF – and after installing a PS2PDF converter (available free from the same site) and the CutePDF software, I’m ready to go.

The way CutePDF works is that it creates a ‘printer’ which, when you print to it, creates a pdf instead. As @paulmiller pointed out on Twitter – with Macs you get this functionality in the system out of the box…

Coming back to that ‘print clipboard to iLiad’ function in the client software – it requires a ‘pdf printer’ to be installed, which CutePDF now provides. I set up the client software to use this, and find that what actually happens is that it prompts me to save the file – so this turns out to be no different to just using the CutePDF printer option!

I then decide to do a trial run – first I just use an open Excel doc and print it to PDF – CutePDF prompts me to save the PDF, and I do so in one of the iLiad Outboxes. Then I sync the iLiad with the PC. To see if this has been successful, I have to unplug the iLiad and then navigate to the document menu – the document isn’t there. I then try again, and print the pdf to a different Outbox folder – sync again – success!

However, you know what Excel files are like – this one straddles 2 pages when printed in portrait. So I do the print again, this time in landscape, and after some further problems getting the sync to work, I finally manage to get this onto the iLiad. However, although it displays OK on the iLiad, the landscape print is shown in portrait – so very small! You can re-orient into landscape on the iLiad, which I do – and finally, success: I’ve got the output I wanted in the first place.

Perhaps I started with too hard a test? But if this is going to work for me, I need to find this kind of quirk and work around it. Deleting docs from the iLiad works in an equally bizarre way – but I’m not going to go into the detail here – suffice to say, it isn’t intuitive (for me), and it takes me about 10 minutes, and a trip to the manual, to work out how to do it.

We’ll see how this all goes – and I’ll post further updates here. I want to limit this trial to 1 month so that others in the library can try the reader – so Imperial Library staff, if you are interested then let me/the IT Team know…