Middlemash, Middlemarch, Middlemap

October 14, 2009 · ostephens

The next Mashed Library event was announced a few months ago, but now more details are available. Middlemash is happening at Birmingham City University on 30th November 2009. I hope to see you there.

In discussion with Damyanti Patel, who is organising Middlemash, we thought it would be nice to do a little project in advance of Middlemash. When we brainstormed what we could do I originally suggested that maybe someone had drawn a map of the fictional geography of Middlemarch, and if we could find one, we could make it interactive in some way. Unfortunately a quick search turned up no such map. However, what it did turn up was something equally interesting – this map of relationships between characters in Middlemarch on LibraryThing.

This inspired a new idea – whether this could be represented in RDF somehow. My first thought was FOAF, but initially this seemed limited as it doesn’t allow for the expression of different types of relationship. However, I then came across this post from Ian Davis (this is the first in a series of 3), which used the Relationship vocabulary in addition to FOAF to express more the kind of thing I was looking for.
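As a rough illustration of combining the two vocabularies, the sketch below writes a few statements about Middlemarch characters as N-Triples. The namespace URIs are the published ones for FOAF and the Relationship vocabulary; the example.org base URI, the helper name to_ntriples and the choice of N-Triples as the output format are assumptions made for illustration, and are not taken from the middlemash.rdf file.

```python
# Namespace URIs for the two vocabularies named in the post.
FOAF = "http://xmlns.com/foaf/0.1/"
REL = "http://purl.org/vocab/relationship/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical base URI for the characters (an assumption, not the real file's).
BASE = "http://example.org/middlemarch/"

NAMES = {
    "dorothea": "Dorothea Brooke",
    "celia": "Celia Brooke",
    "casaubon": "Edward Casaubon",
}
# (subject, Relationship vocabulary property, object)
RELATIONSHIPS = [
    ("dorothea", "siblingOf", "celia"),
    ("dorothea", "spouseOf", "casaubon"),
]


def to_ntriples(names, relationships):
    """Serialise people (FOAF) and their relationships as N-Triples lines."""
    lines = []
    for key, name in sorted(names.items()):
        person = f"<{BASE}{key}>"
        lines.append(f"{person} <{RDF_TYPE}> <{FOAF}Person> .")
        lines.append(f'{person} <{FOAF}name> "{name}" .')
    for subject, prop, obj in relationships:
        lines.append(f"<{BASE}{subject}> <{REL}{prop}> <{BASE}{obj}> .")
    return lines


if __name__ == "__main__":
    print("\n".join(to_ntriples(NAMES, RELATIONSHIPS)))
```

FOAF supplies the Person type and the name, while the Relationship vocabulary supplies the typed links between people that FOAF alone lacks.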

The resulting RDF is at http://www.meanboyfriend.com/overdue_ideas/middlemash.rdf. However, if you want to explore this in a more user-friendly manner, you probably want to use an RDF viewer. Although there are several you could use, the one I found easiest as a starting point was the Zitgist dataviewer. You should be able to browse the file directly with Zitgist via this link. There are however a couple of issues:

Zitgist doesn’t seem to display the whole file, although if you browse through relationships you can view all records eventually
At time of posting I’m having some problems with Zitgist response times, but hopefully these are temporary

This is the first time I’ve written any RDF; I did it by hand, and I was learning as I went along. So I’d be very glad to know what I’ve done wrong, and how to improve it – leave comments on this post please.

I did find some problems with the Relationship vocabulary. It still only expresses a specific range of relationships. It also seems to rely on inferred relationships in some cases. The relationships uncle/aunt/nephew/niece aren’t expressed directly in the Relationship vocabulary – presumably on the basis that they could be inferred through other relationships of ‘parentOf’, ‘childOf’ and ‘siblingOf’ (i.e. your uncle is your father’s brother etc.). However, in Middlemarch there are a few characters who are described as related in this manner, but to my knowledge no mention of the intermediary relationships is made. So we know that Edward Casaubon has an Aunt Julia, but it is not stated whether she is his father’s or mother’s sister, and further his parents are not mentioned (this is as far as I know, I haven’t read Middlemarch for many years, and I went from SparkNotes and the relationship map on LibraryThing).
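The inference described above can be sketched in a few lines, which also shows why it fails when the intermediary relationships are missing. The function name infer_aunts_and_uncles, the label ‘auntOrUncleOf’ and the people alice, bob and carol are all hypothetical; ‘auntOrUncleOf’ is not a term in the Relationship vocabulary.

```python
def infer_aunts_and_uncles(triples):
    """Infer (relative, 'auntOrUncleOf', person) for every sibling of a parent.

    `triples` is an iterable of (subject, property, object) using the
    property names parentOf, childOf and siblingOf.
    """
    parents = {}   # child -> set of that child's parents
    siblings = {}  # person -> set of that person's siblings
    for subject, prop, obj in triples:
        if prop == "parentOf":
            parents.setdefault(obj, set()).add(subject)
        elif prop == "childOf":
            parents.setdefault(subject, set()).add(obj)
        elif prop == "siblingOf":
            # siblingOf is symmetric, so record it in both directions.
            siblings.setdefault(subject, set()).add(obj)
            siblings.setdefault(obj, set()).add(subject)

    inferred = set()
    for child, child_parents in parents.items():
        for parent in child_parents:
            for relative in siblings.get(parent, ()):
                inferred.add((relative, "auntOrUncleOf", child))
    return inferred


if __name__ == "__main__":
    # With the parent recorded, the aunt/uncle link can be inferred.
    print(infer_aunts_and_uncles([
        ("bob", "parentOf", "alice"),
        ("carol", "siblingOf", "bob"),
    ]))
    # With no parent recorded (the Aunt Julia case), nothing can be inferred.
    print(infer_aunts_and_uncles([("carol", "siblingOf", "bob")]))
```

When only the aunt is known and no parent is recorded, there is no chain of statements to follow, so the relationship would have to be stated directly.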

Something that seemed odd is that the Relationship vocabulary does allow you explicitly to relate grandparents to grandchildren without relying on the inference from two parentOf relationships.

Another problem, which is one that Ian Davis explores at length in his posts on representing Einstein’s biography in RDF, is the time element. The relationships I express here aren’t linked to time – so where someone has remarried it is impossible to say from the work I have done here whether they are polygamous or not! I suspect that at least some of this could have been dealt with by adding details like dates of marriages via the Bio vocabulary Ian uses, but I think this would be a problem in terms of the details available from Middlemarch itself (I’m not confident that dates would necessarily be given). It also looked like hard work 🙂
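To make the time problem concrete, here is a small sketch of what dated marriages would allow: checking whether two marriages of the same person overlap. The tuple layout, the function name overlapping and every date below are invented for illustration; this does not use the Bio vocabulary itself, and the novel may not supply such dates at all.

```python
from datetime import date

# (person, spouse, start, end); end=None means no end date is recorded.
# All dates are invented for illustration only.
DATED = [
    ("dorothea", "casaubon", date(1829, 9, 1), date(1831, 3, 1)),
    ("dorothea", "ladislaw", date(1832, 6, 1), None),
]
# The same marriages with no end date for the first one.
UNDATED_END = [
    ("dorothea", "casaubon", date(1829, 9, 1), None),
    ("dorothea", "ladislaw", date(1832, 6, 1), None),
]


def overlapping(marriages):
    """Return (person, spouse_a, spouse_b) for marriages that overlap in time."""
    clashes = []
    for i, (person_a, spouse_a, start_a, end_a) in enumerate(marriages):
        for person_b, spouse_b, start_b, end_b in marriages[i + 1:]:
            if person_a != person_b:
                continue
            # A marriage with no recorded end is treated as still current.
            a_ends = end_a or date.max
            b_ends = end_b or date.max
            if start_a <= b_ends and start_b <= a_ends:
                clashes.append((person_a, spouse_a, spouse_b))
    return clashes


if __name__ == "__main__":
    print(overlapping(DATED))
    print(overlapping(UNDATED_END))
```

Without an end date for the first marriage, the check cannot tell a remarriage from two concurrent marriages, which is the ambiguity the undated relationships leave open.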

So – there you have it, my first foray into RDF – a nice experiment, and potentially an interesting way of developing representations of literary works in the future?

Bathcamp

September 30, 2009 · ostephens

Last weekend, I went to Bathcamp, a barcamp style event, but slightly unusual as it actually included camping. Although I don’t live particularly close to Bath, I knew several of the people involved – mainly via Twitter (at least initially).

After I booked, I suddenly had the idea that rather than drive down to Bath, I could instead do a combination of cycling and taking the train. I had one day’s holiday to take before the end of September, so I decided to set off on Friday morning aiming to get to the campsite in time to get my tent up before the sun went down.

I set off slightly late after a last minute search for the keys to my bike lock, and headed from Leamington Spa down to Moreton-in-Marsh. I was aiming to get to Moreton in time to get the 10:48 train – I had just under 3 hours to do about 25 miles. As I went along I tweeted – starting with this tweet. What I wasn’t aware of was a whole other twitter conversation going on around me.

Unfortunately I made it to Moreton-in-Marsh just in time to see the train I wanted pulling out. So, I stopped for an early lunch (BLT and Chips) in the local pub, and got the 12:48 train to Bath. I’d originally intended to go to Chippenham by train and cycle from there, but I decided I might not make it to the campsite before sundown, and that going to Bath was a safer bet. I tweeted that I was going that way, and got an offer of some company for a bit of the way from Andy Powell – which was extremely welcome as he was able to show me a canal-side route that avoided the huge hill outside Bath.

The weekend included a huge variety of talks – from an introduction to jQuery to Libraries (me), from HTML Email to making music with Ableton Live, as well as films, live music, barbecued dinner and breakfast and the odd sip of cider.

A couple of the talks I managed to make some reasonable notes about – and it surprised me they were both very relevant to my work. The first one was by Giles Turnbull, and was about the use of URL shorteners – Giles said that he was responsible for the original idea which, with some help from other people, became makeashorterlink.com. Giles described how they really didn’t anticipate the level of abuse that the service would get from spammers. However, despite this they kept it going for a couple of years. Then for various reasons – changes in lives and locations – they decided they could no longer maintain the service – they asked if anyone wanted to take over the service, and the fledgling service TinyURL took it over.

The issue that Giles wanted to highlight was that really the service relied on the enthusiasm of a few individuals – and he felt that this was essentially true of all online services. This, combined with the experience of finding old papers belonging to his stepfather (I think), made him realise how ephemeral what he put online was compared to paper. He said he was excited by the idea of Newspaperclub which is a service (currently in Alpha) to create a printed ‘newspaper’ from your online content – something you can keep, or give as a gift.

I’m not convinced by this – the solution to digital preservation can’t ultimately be to print it all out – and as Cameron Neylon pointed out, this is a form of caching rather than preservation – online content isn’t like printed content.

Giles’ talk provoked some discussion – but mainly about the longevity and economic viability of various Internet companies – which for me isn’t the heart of the problem. Even if companies survive, the question of how my grandchildren will access, say, my photos saved as JPEGs is far more of an issue.

The second talk I took notes from was Chris Leonard from BioMedCentral (bit hard to believe at this point, but this really wasn’t a library conference!). Chris spoke about how scientific publishing was gradually creeping outside the journal – to blogs, video and other media – but that it was difficult to keep track, and also difficult for scientists to be ‘rewarded’ for these routes of publication (in the way they are recognised and/or cited when they publish in journals).

Chris suggested an approach like that taken by Friendfeed or Faculty of 1000, which I’ve not come across before. He listed some pros and cons of these different services and suggested that what was needed was a service:

that is open and free
uses metrics to motivate contributors (RAE-worthy metrics)
rewards contributors for their efforts
archives contribution and discussions – making them citable

Chris suggested this approach would mean:

Scientists whose work is not suited to being shoehorned into a pdf may no longer need to write an article
The interconnected web of data could lead to new ‘article’ types
Unpublished research could reach a wider audience [where it is merited] and discredit crackpots

He suggested that “Peer-review Lite” should be able to sort the wheat from the chaff – if not replace the usefulness of traditional peer-review.

I think Chris is right that there is a need to look at new forms of publication and how the effort put into these is recognised and rewarded. However, I also think this is a big challenge – it means changing attitudes towards how academic discourse is conducted, which will be hard to do.

On Sunday morning I skipped out early to cycle back to Bath – a beautiful ride across country, and then along the Kennet and Avon canal – and took the train home. Thanks again to all who organised, especially Mike Ellis, and all those who sponsored an excellent event.

Twitter – a walk in the park?

September 10, 2009, updated September 30, 2009 · ostephens

This week I’ve been at the ALT-C conference in Manchester. One of the most interesting and thought provoking talks I went to was by David White (@daveowhite) from Oxford, who talked about the concept of visitors and residents in the context of technology and online tools.

The work David and colleagues have done (the ISTHMUS project) suggests that it is useful to move on from Prensky’s idea of ‘digital natives and immigrants’ (which David said had sadly been boiled down in popular thought to ‘old people just can’t do stuff’ – even if that wasn’t exactly what Prensky said) and to think instead in terms of visitors and residents.

Residents are those who live parts of their life online – their presence is persistent over time, even when they aren’t logged in. On the other hand Visitors tend to log on, complete a task, and then log off, leaving no particular trace of their identity.

The Resident/Visitor concept isn’t meant to be a binary one – it is a continuum – we all display some level of both types of behaviour. Also, it may be that you are more ‘resident’ in some areas of your life or in some online environments, but more a ‘visitor’ in others.

I think the most powerful analogy David drew was to illustrate ‘resident’ behaviour as people milling round and picnicking in a park. They were ‘inhabiting’ the space – not solving a particular problem, or doing a particular task. It might be that they would talk to others, learn stuff, experience stuff etc. but this probably wasn’t their motivation in going to the park.

On the other hand a visitor would treat an online environment in a much more functional manner – like a toolbox – they would go there to do a particular thing, and then get out.

David suggested that some online environments were more ‘residential’ than others – perhaps Twitter and Second Life both being examples – and that approaching these as a ‘visitor’ wasn’t likely to be a successful strategy. That wasn’t to pass judgement on the use or not of these tools – there’s nothing to say you have to use them.

David also noted that moving formal education into a residential environment wasn’t always easy – you can’t just turn up in a pub as a teacher and start teaching people (even if those same people are your students in a formal setting) – and that the same is true online. An example was the different attitudes from two groups of students to their tutors when working in Second Life – in the first example the tutor had worked continually in SL with the students, and had successfully established their authority in the space. In the second example a tutor had only ‘popped in’ to SL occasionally, and tried to act with the same authority – which grated on the students.

At the heart of the presentation was the thesis that we need to look much more at the motivations and behaviours of people, not focus on the technology – a concept that David and others are trying to frame – currently under the phrase ‘post-technical’. Ian Truelove has done quite a good post on what post-technical is about.

Another point made was that setting up ‘residential’ environments could be extremely cheap – and you should think about this when both planning what to do and what your measures of ‘success’ are – think about the value you get in terms of your investment.

The points that David made came back to me in a session this morning on Digital Identity (run by Frances Bell, Josie Fraser, James Clay and Helen Keegan). I joined a group discussing Twitter, and some of the questions were about ‘how can I use Twitter in my teaching/education’. For me, a definite ‘resident’ on Twitter, this felt like an incongruous question. I started to think about it a bit more and realised there are ‘tool’ like aspects to Twitter:

Publication platform (albeit in a very restrictive format)
Ability to publish easily from mobile devices (with or without internet access)
Ability to repurpose outputs via RSS

This probably needs breaking down a bit more. But you can see that if you wanted to create a ‘news channel’ that you could easily update from anywhere, you could use Twitter, and push an RSS version of the stream to a web page etc. In this way, you can exploit the tool like aspects of Twitter – a very ‘visitor’ approach.
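The ‘news channel’ idea can be sketched as: take an RSS feed and turn its latest items into an HTML list for a web page. The sample feed, its example.org links and the function name rss_to_html are all made up for illustration; a real version would fetch the feed from wherever the stream is published rather than use an inline string.

```python
import xml.etree.ElementTree as ET
from html import escape

# A made-up RSS 2.0 feed standing in for the published stream.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Library news</title>
    <item>
      <title>Library closes at 5pm today</title>
      <link>http://example.org/status/1</link>
    </item>
    <item>
      <title>New e-journals &amp; databases available</title>
      <link>http://example.org/status/2</link>
    </item>
  </channel>
</rss>"""


def rss_to_html(rss_text, limit=5):
    """Render the first `limit` feed items as an HTML unordered list."""
    root = ET.fromstring(rss_text)
    rows = []
    for item in root.findall("./channel/item")[:limit]:
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Escape both values so feed content cannot break the page markup.
        rows.append(
            f'<li><a href="{escape(link, quote=True)}">{escape(title)}</a></li>'
        )
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    print(rss_to_html(SAMPLE_RSS))
```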

However, I’d also say that if you want to do this kind of thing, there are probably better platforms than Twitter (or at least, equally good platforms) – perhaps the WordPress Microblog plugin that Joss Winn mentioned in his session on WordPress (another very interesting session).

For me, the strength of Twitter in particular is the network I’ve built up there (something reinforced by the conference as I met some of my Twitter contacts for the first time – such as @HallyMk1, who has posted a great reflection on the conference – although I should declare an interest – he says nice things about me). I can’t see that you can exploit this side of Twitter without accepting the need to become ‘resident’ to some degree. Of course, part of the issue then becomes whether there is any way you can exploit this type of informal environment for formal learning – my instinct is that this would be very difficult – but what you can do is facilitate for the community both informal learning and access to formal learning.

As an aside, one of the things that also came out of the Digital Identities session was that even ‘visitors’ have an online life – sometimes one they aren’t aware of – as friends/family/strangers post pictures of them (or write about them). We all leave traces online, even if we don’t behave as residents.

The final thread I want to pull on here is a phrase that was used and debated (especially I think in the F-ALT sessions): “it’s not about the technology”. This was certainly part of the point that David White made – that people’s motivations were much more important than any particular technology they would use to achieve their goals. He made the point that people who don’t use Twitter don’t avoid doing so because they aren’t capable, or don’t understand, they just don’t have the motivation to use it.

Martin Weller has posted on this and I think I agree with him when he says “I guess it depends on where you are coming from” – and I think the reason that the phrase got debated so much is that the audience at ALT-C is coming from many different places.

I’m guilty of liking the ‘shiny shiny’ stuff as much as any other iPhone owning geek – but the thing that interests me in this context is what the impact is likely to be on education (or more broadly to be honest, society) – I’m not in the position of being immediately concerned about how Twitter or iPhones or whatever else should be used in the classroom.

I do think that we need to keep an eye on how technology continues to change because I think a very few technologies impact society to the extent that our answers need to change – but the question remains the same whatever – how are we going to (need to) change the way we educate to deal with the demands and requirements of society in the 21st Century.

IceRocket Tags: altc2009

Scraping, scripting and hacking your way to API-less data

July 7, 2009 · ostephens

Mike Ellis from eduserv talking about getting data out of web pages.

Scraping – basically allows you to extract data from web pages – and then you can do stuff with it! Some helpful tools for scraping:

Yahoo!Pipes
Google Docs – use of the importHTML() function to bring in data, and then manipulate it
dapper.net (also mentioned by Brendan Dawes)
YQL
httrack – copy an entire website so you can do local processing
hacked search – use Yahoo! search to search within a domain – essentially allows you to crawl a single domain and then extract data via search
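As an illustration of what scraping involves underneath tools like these, the sketch below pulls the rows of an HTML table into lists using only Python’s standard library, which is roughly the job that importHTML() does inside a spreadsheet. The class name TableScraper, the helper scrape_table and the sample table are invented for the example, and it assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser

# A made-up page fragment to scrape.
SAMPLE_HTML = """
<table>
  <tr><th>Branch</th><th>Opens</th></tr>
  <tr><td>Central</td><td>09:00</td></tr>
  <tr><td>North</td><td>10:30</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, one list per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_table(html_text):
    """Return the table rows found in `html_text` as lists of cell text."""
    scraper = TableScraper()
    scraper.feed(html_text)
    return scraper.rows


if __name__ == "__main__":
    for row in scrape_table(SAMPLE_HTML):
        print(row)
```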

So, once you’ve scraped your data, you need some tools to ‘mung’ it (i.e. manipulate it)

regex – regular expressions are hugely powerful, although can be complex – see some examples at http://mashedlibrary.ning.com/forum/topics/extracting-isbns-from-rss
find/replace – can use any scripting language, but you can even use Word (I like to use Textpad)
mail merge (!) – if you have data in excel, or access, or csv etc. you can use mail merge to output with other information – e.g. html
html removal – various functions available
html tidy – http://tidy.sourceforge.net – can chuck in ‘dirty’ html – e.g cut and pasted from Word, and tidy it up
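Two of these munging steps, html removal and regex, can be combined in a short sketch that strips tags from a feed fragment and then pulls out anything that looks like an ISBN. The patterns and function names are assumptions written for this example (they are not taken from the linked forum thread), only unhyphenated ISBNs are handled, and the tag stripping is deliberately crude.

```python
import re

# Unhyphenated ISBN-13 (978/979 prefix) or ISBN-10 (last character may be X).
ISBN_PATTERN = re.compile(r"\b(?:97[89]\d{10}|\d{9}[\dX])\b")
# Crude tag matcher: fine for simple markup, not a full HTML parser.
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_tags(html_text):
    """Replace every tag with a space so neighbouring words stay separate."""
    return TAG_PATTERN.sub(" ", html_text)


def is_valid_isbn(candidate):
    """Check the ISBN-10 or ISBN-13 check digit to weed out other numbers."""
    if len(candidate) == 10:
        total = sum(
            (10 - i) * (10 if ch == "X" else int(ch))
            for i, ch in enumerate(candidate)
        )
        return total % 11 == 0
    total = sum(
        int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(candidate)
    )
    return total % 10 == 0


def extract_isbns(html_text):
    """Return the valid ISBNs found in `html_text`, in order of appearance."""
    text = strip_tags(html_text)
    return [m for m in ISBN_PATTERN.findall(text) if is_valid_isbn(m)]


if __name__ == "__main__":
    sample = (
        "<item><description>ISBN: <b>9780306406157</b></description></item>"
        "<item><description>isbn 0306406152, ref 1234567890</description></item>"
    )
    print(extract_isbns(sample))
```

Checking the check digit is a design choice: a bare ten-digit pattern also matches phone numbers and reference codes, so validation cuts down the false positives.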

Processing data:

Open Calais – service from Reuters that analyses block of text for ‘meaning’ – e.g. if it recognises the name of a city it can give information about the city such as latitude/longitude etc.
Yahoo!Term Extraction – similar to Open Calais – submit text/data and get back various terms – also allows tuning so that you can get back more relevant results
Yahoo!geo – a set of Yahoo tools for processing geographic data – http://developer.yahoo.com/geo

The ugly sisters:

Access and Excel – don’t dismiss these! They are actually pretty powerful

Last resorts:

Use Freedom of Information – for data you can’t get any other way, submit FoI requests via What do they know
OCR stuff (Mike has used http://www.softi.co.uk/freeocr.htm)
Re-key data – or use Mechanical Turk to get people to do it for you?

Somewhere I have never travelled

July 7, 2009 · ostephens

This presentation by Brendan Dawes – http://www.brendandawes.com/ (powered by WordPress)

Brendan quite into data – “data porn” – visualising data. Saying that much of the web is still designed as if it’s in print.

Making ‘weird creatures’ out of keywords http://www.brendandawes.com/?s=redux – ‘creatures’ size indicates popularity, speed they move depends on age – but this stuff doesn’t come with an instruction manual – there is nowhere that these links between data and behaviour are documented for the ‘end user’ – but just putting it out there, and trying it out.

‘Interfaces’ are important – Brendan likes to collect ideas in ‘Field Notes’ books – http://fieldnotesbrand.com/. Also has a firewire drive full of ‘doodles’ as his ‘digital notebook’ – just bits and pieces of stuff that may do one thing – e.g. a drawing app, that allows you to draw things in black ink – that sat there for ages, he did nothing with it. Then had an idea that he wanted to be able put stuff on lines that he had drawn – found something that someone else had done online – and he had put that on his digital notebook.

Brendan wanted to do something with http://www.daylife.com/

(aside – When you design stuff for people, avoid colours – as people can dump a perfectly good idea if you’ve done it in the wrong colour! Use black and white, because it doesn’t upset anyone 🙂 )

What would happen if we removed interfaces completely? Allowed people to build their own interface?

So – all of these bits and pieces came together as http://doodlebuzz.com/ – allows you to do a search – then you draw a line to see the results displayed.

Memoryshare – a BBC project to share memories. Original version had a rather dull interface – didn’t engage people, so not very good usage – although the content is very compelling when you start reading. Brendan and team did a range of prototypes – very open brief – basically do anything you want.

Took ideas done with the Daylife example – displaying time based events on a spiral line – great ‘wow’ moment when you see the spiral on the screen, and then as you zoom in it becomes obvious that it is a 3d environment – very, very pretty! Original demo was in Flash, which couldn’t cope with the amount of data in memoryshare – but the BBC really liked the design, so figured out how to do it – see the results at http://www.bbc.co.uk/dna/memoryshare/ – compare this to the old design at the Internet Archive Wayback Machine.

Brendan now moving onto using data to produce physical objects – mentioned a site I didn’t get (Update: thanks to @nicoleharris got this now http://www.ponoko.com/make-and-sell/how-to-make) that allows you to upload a design and get it made – so for example Brendan has had some wooden luggage tags made with data displayed on them. Moo.com has an API – you can pump data in and get physical objects out. Brendan has written something that takes data from wefeelfine.org and pushes to moo.com to make cards – transfers transient digital data into less transient physical data

Visualisation

July 7, 2009 · ostephens

Iman Moradi is talking about how we organise library stock and spaces – he’s going through at quite a pace, so very brief notes again.

Finding things is complex

It’s a cliché that library users often remember the colour of the book more than the title – but why don’t we respond to this? Organise books by colour – example from Huddersfield town library.

Iman did a demonstrator – building a ‘quotes’ base for a book – use a pen scanner to scan chunk of text from book, and associate with book via ISBN – starts to build a set of quotes from the book that people found ‘of interest’
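The data side of a ‘quotes’ base like this could be as simple as a mapping from ISBN to the passages people have scanned. The sketch below is a guess at a minimal structure, not a description of Iman’s demonstrator; the class name QuoteBase, its methods and the sample passages are all hypothetical, and the pen scanner is represented only by the text it would produce.

```python
from collections import defaultdict


def normalise_isbn(isbn):
    """Drop hyphens and spaces so the same book always gets the same key."""
    return isbn.replace("-", "").replace(" ", "").upper()


class QuoteBase:
    """Store scanned passages against the ISBN of the book they came from."""

    def __init__(self):
        self._quotes = defaultdict(list)

    def add(self, isbn, scanned_text):
        """Record one scanned passage for the book with this ISBN."""
        self._quotes[normalise_isbn(isbn)].append(scanned_text.strip())

    def quotes_for(self, isbn):
        """Return a copy of the passages recorded for this ISBN."""
        return list(self._quotes.get(normalise_isbn(isbn), []))


if __name__ == "__main__":
    base = QuoteBase()
    # Placeholder passages standing in for pen-scanner output.
    base.add("978-0-306-40615-7", "  first scanned passage ")
    base.add("9780306406157", "second scanned passage")
    print(base.quotes_for("9780306406157"))
```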

Think about libraries in terms of games – users are ‘players’, the library is the ‘game environment’. Using libraries is like a game:

Activities = Finding, discovery, collection
Points/levels = acquiring knowledge

Mash Oop North

July 7, 2009 · ostephens

Today I’m at Mash Oop North aka #mashlib09 – and kicking off with a presentation from Dave Pattern – some very brief notes:

Making Library Data Work Harder

Dave Pattern – www.slideshare.net/daveyp/

Keyword suggestions – about 25% of keyword searches on Huddersfield OPAC give zero results.
Look at what people are typing in the keyword search – Huddersfield found ‘renew’ was a common search term – so can pop up an information box with information about renewing your books.

Looking at common keyword combinations can help people refine their searches
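One way to read ‘common keyword combinations’ is: for a given search term, count which other terms most often appear alongside it in past searches, and offer those as refinements. The sketch below does this over a made-up search log; the function name refinement_suggestions and the sample queries are assumptions, and this is not a description of how the Huddersfield OPAC actually does it.

```python
from collections import Counter

# A made-up log of past keyword searches.
SEARCH_LOG = [
    "law contract",
    "law criminal",
    "contract law cases",
    "criminal law",
    "law",
    "nursing ethics",
]


def refinement_suggestions(search_log, keyword, top_n=3):
    """Suggest the terms most often combined with `keyword` in past searches."""
    keyword = keyword.lower()
    counts = Counter()
    for query in search_log:
        terms = set(query.lower().split())
        if keyword in terms:
            counts.update(terms - {keyword})
    # Sort by frequency, then alphabetically so ties come out in a stable order.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [term for term, _ in ranked[:top_n]]


if __name__ == "__main__":
    print(refinement_suggestions(SEARCH_LOG, "law"))
```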

Borrowing suggestions – people who borrowed this item, also borrowed …
Tesco’s collect and exploit this data. Do libraries sometimes assume we know what is best for our users? Perhaps we need to look at data to prove or disprove our assumptions

Because borrowing driven by reading lists, perhaps helps suggestions stay on-topic
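A ‘people who borrowed this item also borrowed’ list can be built from nothing more than circulation transactions. The sketch below is one simple way to do it, counting co-borrowing across borrowers; the function name also_borrowed, the borrower and item identifiers and the ranking rule are assumptions for illustration rather than Dave’s implementation.

```python
from collections import Counter, defaultdict

# Made-up circulation transactions: (borrower id, item id).
LOANS = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "A"), ("u3", "C"), ("u3", "D"),
    ("u4", "D"),
]


def also_borrowed(loans, item, top_n=3):
    """Rank other items by how many borrowers of `item` also borrowed them."""
    items_by_borrower = defaultdict(set)
    for borrower, loaned_item in loans:
        items_by_borrower[borrower].add(loaned_item)

    counts = Counter()
    for items in items_by_borrower.values():
        if item in items:
            counts.update(items - {item})
    # Most co-borrowed first; ties broken by item id for a stable order.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:top_n]]


if __name__ == "__main__":
    print(also_borrowed(LOANS, "A"))
```

Only anonymised borrower identifiers are needed here, since the counts depend on who borrowed together rather than on who the borrowers are.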

Course specific ‘new books’ list – based on what people on specific courses borrow
Able to do amazon-y type personalised suggestions
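The talk did not spell out how the course specific lists are produced, so the following is only one plausible reading: find the subject areas that students on a course borrow from most, then list new acquisitions in those areas. The function name new_books_for_course, the use of classmark-style subject codes and all the sample data are assumptions.

```python
from collections import Counter

# Made-up loans: (course of the borrower, subject code of the item).
COURSE_LOANS = [
    ("nursing", "610"), ("nursing", "610"), ("nursing", "150"),
    ("law", "340"), ("law", "340"), ("law", "150"),
]
# Made-up new acquisitions: (title, subject code).
NEW_BOOKS = [
    ("Title A", "610"),
    ("Title B", "340"),
    ("Title C", "150"),
]


def new_books_for_course(course, loans, new_books, top_subjects=1):
    """List new books in the subjects most borrowed by students on `course`."""
    subject_counts = Counter(subject for c, subject in loans if c == course)
    # Most borrowed subjects first; ties broken by code for a stable order.
    ranked = sorted(subject_counts.items(), key=lambda pair: (-pair[1], pair[0]))
    favoured = {subject for subject, _ in ranked[:top_subjects]}
    return [title for title, subject in new_books if subject in favoured]


if __name__ == "__main__":
    print(new_books_for_course("nursing", COURSE_LOANS, NEW_BOOKS))
    print(new_books_for_course("law", COURSE_LOANS, NEW_BOOKS, top_subjects=2))
```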

Borrowing profile for Huddersfield – average number of books borrowed shows v high peak in October, lull during the summer – now can see the use of the suggestions following this with a peak in November.

Seems to be a correlation between introduction of suggestions/recommendations with increase in borrowing – how could this be investigated further?

Started collecting e-journal data via SFX – starting to do journal recommendations based on usage.

Suggested scenario – can start seeding new students’ experience – 1st time student accesses website can use ‘average’ behaviour of students on same course – so highly personalised. Also, if information delivered via widgets could drag and drop to other environments.

JISC Mosaic project, looking at usage data (at National level I think?)

So – some ideas of stuff that you might do with usage data:

#1 Basic library account info:
Just your bog standard library options
– view items on loan, hold requests etc
– renew items
Configure alerting options
– SMS, Facebook, Google Telepathy
Convert Karma
– rewards for sharing information/contributing to pool of data – perhaps swap karma points for free services/waiving fines etc.

#2 Discovery service
Single box for search

#3 Book recommendations
Students like book covers
Primarily a ‘we think you might be interested in’ service
Uses database of circulation transactions, augmented with Mosaic data
time relevant to the modules student is taking
Adapts to choices student makes over time

#4 New books
Data-mining of books borrowed by student on a course
Provide new books lists based on this information (already doing this at Huddersfield I think)

#5 Relevant Journals

#6 Relevant articles
– Whenever student interacts with library services e.g. keywords etc. – refines their profile

At one remove

June 1, 2009, updated June 3, 2009 · ostephens

You will have seen from my previous post that I’ve moved this blog recently. There were a few challenges associated with this which I want to document here, but perhaps the first thing to tackle is why I was moving the blog in the first place.

The domain www.meanboyfriend.com came about through a joke between me and my girlfriend (now wife), Damyanti, about what a mean boyfriend I was, and how she would publish a list of my misdemeanours on a website dedicated to this – meanboyfriend.com (at least, I think it was a joke). When we decided to setup a blog, buying the meanboyfriend.com domain seemed like a good punchline. I can’t remember now whether choosing our own domain name was the result of clear thinking about wanting to own the domain on which our stuff lived or not – but I think in retrospect it was a good decision (rather than simply using the URL provided by Typepad).

At the time (6 years ago) the Typepad blogging platform was getting good reviews, and Movable Type (which powered Typepad) was one of, if not the, leading blogging platforms. We setup a joint account with Typepad – because of the type of account we had with Typepad, we had a single user – which was our joint account, and all entries on our blogs appeared to be by our amalgamated personality – damyantiandowen. Also the FOAF file that Typepad will automatically create for you was for this joint identity. We were also limited to three blogs on the account.

Having sorted out the technical side, we setup our first blog – Overdue. This was a personal blog, which was aimed primarily at friends and family. We also used the Typepad photo facility to put up photos from holidays etc. To be honest, we’ve never been that great at updating the blog, although we use the photos a lot (and as a result have never really invested in Flickr or Picasa or other similar photo sharing services). After a hiatus of over a year covering the whole of 2008, we decided that we would try to refocus the blog on food/drink stuff – see our explanation at Foods for thought. As this is truly a ‘joint’ blog, entries appearing as authored by damyantiandowen are fine – and although each entry is generally written by one or the other of us, it feels like a joint venture.

Shortly after this, I decided that a professional blog would be useful to record thoughts and ideas relating to my work. I set this up as Overdue Ideas (see what I’m doing here?). This was mapped to a URL we owned, still under the meanboyfriend.com domain (http://www.meanboyfriend.com/overdue_ideas). Although in theory I would have been happy for this to be a joint blog (and I should acknowledge that many of my ideas and posts come out of conversations with Damyanti), in practice I was the only author. This made the joint account a bit of an issue – not on a day to day basis, but just occasionally. Last week I was contacted by someone wanting to quote Overdue Ideas, and unsure whether the quote should be attributed to me or Damyanti. This confusion has happened more than once.

Some years passed with this being the basic situation – 1 account, 2 blogs, some photos and a few bits and pieces were hosted on Typepad and appeared under http://www.meanboyfriend.com. As our account would support 3 blogs, we are currently using the 3rd blog as a protected file space (as far as I can tell on Typepad you can only protect at the blog level, not on individual files or pages) for stuff we only want to share with specific people.

So why move?

Over time, I feel that the Typepad application hasn’t quite kept up with state of the art in blogging, and as I saw what others were achieving with WordPress I got some tech envy. WordPress supports a huge array of plugins, and Akismet seems to be state of the art as far as catching comment spam goes – although this wasn’t a massive problem on my blog, it was an irritant with one or two spam messages a week to clear out (I should say that this was the stuff that got through – Typepad’s own spam filters caught a lot of spam for me that didn’t make it to the blog).

Also, the issue of our con-fused (see Neal Stephenson) identities was an occasional issue – especially as discussions about online identity moved on I realised that we had a bit of a problem here. As well as the confusion for readers – who was actually writing this blog, and who had authored which post – there were other issues. Typepad automatically creates FOAF files – but for us, this was for our joint identity. Typepad also supports OpenID, but again we got one OpenID between the two of us.

The final push came when Damyanti wanted to setup her own blog – which would have taken us beyond our 3 blog limit.

One solution to much of this would have been to upgrade our Typepad account (from ‘Plus’ to ‘Pro’). This would have allowed us to have unlimited blogs, and unlimited authors. But in the end my techno-lust won the day – I wanted a bit more flexibility, and the ability to do other things (e.g. install other software).

It looked like it was time to move blogging platforms to support our separate identities, multiple blogs and satisfy my techno-lust. Having seen a number of people I know on Twitter mentioning Dreamhost, and getting some good feedback when I asked how it was working, I decided to go with them as a host. As I’ve already mentioned, I’d been admiring what people could do with WordPress – I was blown away by the iPhone theme that Joss Winn has on his blog (when viewed with an iPhone).

So – you are now reading this blog powered by WordPress, and hosted by Dreamhost. The move was slightly traumatic, but if I can, I’ll document this separately. If you are thinking of doing a similar move (and are of the tech inclination) I’d recommend Rob Styles’ post on moving from Typepad to WordPress for information on dealing with redirecting URLs etc – something I struggled with (and still haven’t completely dealt with).

A gathering place for UK Research

January 20, 2009 · ostephens

I’m the project director for EThOSNet – which is establishing a service, run by the British Library, to provide access to all UK PhD and Research Theses. The service itself is called EThOS (Electronic Theses Online Service).

Today, EThOS has gone into public beta – without fanfare, the service is now available, and can be found at http://ethos.bl.uk. The key parts of the service are:

A catalogue of the vast majority of UK Research Theses
The ability to download electronic versions where they exist
The ability to request an electronic version be created where it doesn’t already exist

I’m incredibly excited about this – of all the projects I’ve been involved in, although not the biggest in terms of budget (I don’t think), it has the most potential to have an incredible impact on the availability of research. Until now, if you wanted to read a thesis you either had to request it via ILL, or take a trip to the holding university. Now you will be able to obtain it online. To give some indication of the difference this can make, the most popular thesis from the British Library over the entire lifetime of the previous ‘Microfilm’ service was requested 58 times. The most popular electronic thesis at West Virginia University (a single US University) in the same period was downloaded over 37,000 times. If we can even achieve a relatively modest increase in downloads I’ll be happy – if we can hit tens of thousands then I’ll be delighted.

The project to set up EThOS has been jointly funded by JISC and RLUK, with contributions from the British Library, and a number of UK Universities and other partners, including my own, Imperial College London, which leads the project. The launch of the service is the culmination of several projects, including ‘Theses Alive!’, ‘Electronic Theses’, ‘DAEDALUS’, ‘EThOS’, and the current ‘EThOSNet’.

With so much work done before and during the EThOSNet project, my own involvement (which started some way into the EThOSNet project, when I took over as Project Director from Clare Jenkins in autumn 2007) looks pretty modest, so thanks to all who have worked so hard to make EThOS possible, and get it live.

One of the biggest issues that has surfaced several times during the course of these projects is the question of IPR (Intellectual Property Rights). EThOS is taking the bold, and necessary, step of working as an ‘opt-out’ service. This is based on a careful consideration of all the issues, which has concluded:

The majority of authors wish to demonstrate the quality of their work.
Institutions wish to demonstrate the quality of their primary research.

In order that authors can opt out if they do not want their thesis to be made available via EThOS, there is a robust take-down policy – available at EThOS Toolkit.

As an author, you can also contact your University to let them know that you do not wish your thesis to be included in the EThOS service.

By making this opt-out and take-down approach as transparent as possible (including doing things like advertising it on this blog), we believe that authors have clear options they can exercise if they have any concerns about the service.

Finally, the derivation of the word Ethos (according to Wikipedia) is quite interesting ™. There are many aspects of the word that felt relevant to the service – the idea of a ‘starting point’, and the idea that ‘ethos’ belongs to the audience both resonate with what EThOS is trying to do. However, for the title of the post I decided to draw on Michael Halloran’s assertion that "the most concrete meaning given for the term in the Greek lexicon is 'a habitual gathering place'" – which I believe is what EThOS will become to those looking for UK research dissertations.

Technorati Tags: ethos,e-theses

Developer Happiness Days

January 13, 2009 · ostephens

In February JISC is running dev8D – a 3/4 day event (9-13 Feb) aimed mainly at software developers in the UK HE space. There is also plenty of room for interested others at the event – as the website says:

Not an educational developer? Not a problem. One of the most exciting aspects of Developer Happiness Days is that it is gathering developers from a variety of sectors, both inside and outside the UK, to promote collaboration and creativity across different areas of research.

As well as teams from across the spectrum of higher education work, commercial developers are attending from a range of sectors. Providing an opportunity for public sector and commercial developers to meet and network is a key benefit of the event.

Not a developer? Again, welcome! The best code is created through collaboration with the end users and we want the people who will be testing the tools in their everyday work to have their voice heard too. And for the technophile tinkerers among you, there will be workshops to help bring everyone up to speed with the lingo.

I've been involved in the planning of the event (although this mainly consisted of commenting on other people's ideas!), but I hope you'll believe me when I say the final programme looks excellent.

The event is free to attend, and there is also a small amount of free accommodation available.

So, whether you are a developer / hacker / scriptkitty / blogger / observer / lurker / collaborator / user / uber-user / usability expert, get over to http://www.dev8d.org/book.html and sign up!

Technorati Tags: dev8d

Overdue Ideas

Ideas linking Libraries, Computing, E-learning, and anything else that springs to mind.
