Something kind of OO

My first experience of programming was on a BBC B (for those old enough to remember), using BBC Basic. I didn’t do computing/computer science at school, but I was interested enough to move from typing in program listings from books and magazines to writing some basic programs myself – the one I remember particularly would tell you which day of the week any given date fell on.

Some years later my first ‘real world’ experience of any kind of programming was writing macros in WordPerfect – macros are a way of automating a set of commands or key strokes (often still very useful, I’d really recommend looking at tools like AppleScript, AutoHotkey and MacroExpress – they can really help simplify tasks, and sometimes deliver significant time savings). Most macro languages also support some kind of ‘logic’, allowing you to carry out parts of the macro only when certain conditions are true.

After another gap, my next step on the ladder was using Perl. I initially picked this up because I was working with applications that were written in Perl, and as I started to use it, it felt very familiar – taking me back to my experience on the BBC. I also found the active community around Perl meant that when I hit problems, there was almost certainly help on hand. Dealing with XML especially I found there was a good tool set already available for me to pick up and start using.

By this point all the programming I’d done was procedural. In procedural programming you write a set of ‘procedures’ or ‘routines’ which are (to a large extent) self contained sets of instructions. As you go through a program you can ‘call’ a procedure whenever it is needed – once all the code in the procedure has run, the program picks up from the point you initially called the procedure to run.
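To make the idea of ‘calling’ a procedure concrete, here is a minimal sketch in Ruby. The procedure name day_name is made up for illustration, and it leans on Ruby's standard Date library rather than working out the day of the week by hand.

```ruby
require "date"

# A procedure: a named, self-contained set of instructions.
def day_name(year, month, day)
  Date.new(year, month, day).strftime("%A")
end

puts "Program starts"
# Calling the procedure: its code runs, then the program
# picks up again from this point.
puts day_name(2000, 1, 1)
puts "Program carries on after the call"
```

The same procedure can be called from anywhere else in the program, as many times as it is needed.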

It was probably around this time that I thought I ought to really apply myself to learning a programming language properly, and so I picked up a few books on C and C++, and started to read about an alternative to procedural programming – ‘Object Oriented Programming’ (OOP). Although OOP had been around for a while, it was probably in the mid 90s that it started to become widely used.

To be honest, I struggled. I couldn’t get my head around the OOP concept, and at the same time C and C++ were much more difficult to get to grips with than the languages I’d previously used. I didn’t get anywhere, eventually gave up, and stuck with Perl.

Although Perl is often used for procedural programming, it can also be used with an object-oriented (OO) approach and where I was working with code that others had written, I did sometimes use an OO approach – but really without understanding properly what I was doing, and relying on copying examples and practice from others.

As I started to do jobs that were less ‘hands on’, I hardly got time to do any programming, and it wasn’t until last year that I decided I’d find myself some ‘hobby’ projects I could do for fun. Having done a few of these (e.g. Read to Learn and What to Watch) in Perl, I thought it might be time to try something new again. Rather than heading back to C++ or Java, I decided I’d try to take (what I hoped would be) a smallish step – and was left choosing between two languages – Ruby and Python. Both had a reputation for being relatively easy to pick up, and also for enabling you to get stuff done quickly (I really liked this idea).

Having looked at both, and kicked their tyres, I eventually opted for Ruby. I didn’t have very strong feelings about which way to go, but my initial look suggested to me that I’d find Ruby easier – it looked a bit like Perl to me, whereas Python reminded me more of C (not that I’m shallow and go just by looks), a few people recommended it to me, and there was an active community – including people using it for library type stuff (Blacklight is written in Ruby). I hope that at some point I might have a closer look at Python – one thing that did appeal was the fact that the Google App Engine supports Python, making it possible to launch a Python based app without needing to host it on a server somewhere.

The other thing about Ruby is it is often described as ‘completely Object Oriented’ – I was never entirely clear what was meant by this, but as one of my aims was to get to grips with the concept of OOP, it seemed like this was a good place to start.

Having decided to go with Ruby I found a couple of online tutorials (http://tryruby.org/ lets you actually do some Ruby live online straight away, while Ruby in 20 minutes talks you through the basics) and worked my way through them to get the hang of the basics. I also invested in O’Reilly’s “The Ruby Programming Language” on my iPhone – at £2.99 (compared to an RRP for the print edition of £30.99, and currently on Amazon at £18.99!) I think this is really good value, and although I am limited to using it on the iPhone, in this case I’m generally using it like a reference work, and it’s quite nice to use alongside my laptop.

I’ve always found that the only way I really engage with a programming language is to try to use it in reality – tutorials are fine for basic familiarity, but I’m much happier when I’m trying to solve my own problems – and also doing a representative project means I focus on the parts of the language that are really useful to me. So having recently written What to Watch in Perl, I thought a nice easy exercise for me would be to rewrite it in Ruby – it’s only a couple of hundred lines of code but does several tasks I’m likely to do in other places such as retrieve data from web services in XML format and output RSS.

One of the first things I realised was that, although Ruby is an OO language, for a simple script such as the one I was writing it would be perfectly possible to take a very procedural approach to programming using Ruby. The question of whether you use an OO or procedural approach is really about how you think about what you are doing, and how you model your data.

Up until this point I’d been familiar with two types of ‘data structure’ – ways of storing data within a program. These were Arrays and Hashes. Arrays are simple lists of things, whereas Hashes are lists of pairs – each pair consisting of a key and a value. The idea of a hash is that you can look up a value for any given key.

Just for illustration, if you wanted to store a list of ISBNs in a program, you could do this as an Array which would look pretty much as you’d expect – e.g. (9780671746728, 9780671742515, 9780517226957).

On the other hand, if you wanted to describe a book you might do this using a hash – looking something like:

{ author => "Adams, Douglas", title => "Dirk Gently’s Holistic Detective Agency", ISBN => 9780671746728 }

You can create more complex structures by mixing and matching these – for example you could have an array of hashes to represent a list of books with detailed metadata, and within this you might even have some of the hash values as arrays – e.g. to represent a list of authors. You can imagine that this can quickly get confusing!
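As a rough sketch of how these structures look in Ruby: the variable names are illustrative, and the second book (Good Omens) is added here only to show a hash value that is itself a list of authors.

```ruby
# An Array: a simple ordered list of ISBNs.
isbns = [9780671746728, 9780671742515, 9780517226957]

# A Hash: key/value pairs describing one book.
book = {
  "author" => "Adams, Douglas",
  "title"  => "Dirk Gently's Holistic Detective Agency",
  "isbn"   => 9780671746728
}

# Look up a value by its key.
puts book["title"]

# Mixing and matching: an Array of Hashes, where one value is itself an Array.
books = [
  book,
  {
    "authors" => ["Pratchett, Terry", "Gaiman, Neil"],
    "title"   => "Good Omens"
  }
]

# Reaching into the nested structure takes a chain of lookups.
puts books[1]["authors"][0]
```

Note that the two hashes already disagree about whether the key is "author" or "authors", which is exactly the kind of confusion that creeps in.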

The thing about this approach is that it is easy to evolve these structures as you go along – there is nothing to stop you adding a new ISBN to the list in the Array, or adding a new key/value pair to the hash – if you wanted to record the publisher for example. This is also a problem, as it means you can easily lose track of what you are storing where, or do nonsensical things (e.g. add a ‘director’ key to the hash which is meant to describe books rather than films).
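A small Ruby sketch of both sides of this, assuming a book stored as a hash; the publisher and director values are placeholders invented for the example.

```ruby
isbns = [9780671746728, 9780671742515]
book = {
  "author" => "Adams, Douglas",
  "title"  => "Dirk Gently's Holistic Detective Agency"
}

# Easy to evolve: append to the Array, add a new key to the Hash.
isbns << 9780517226957
book["publisher"] = "Example Press"  # hypothetical publisher name

# Also easy to get wrong: nothing objects to a key that
# makes no sense for a book.
book["director"] = "Someone"         # placeholder value, accepted without complaint

p book.keys
```

Ruby accepts the ‘director’ key without any warning, so keeping the structure sensible is entirely down to the programmer.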

Ruby (like other OO languages) doesn’t abandon the concepts of arrays and hashes – it supports both. However, at the heart of an object-oriented approach is the idea of an ‘object’. The big realisation for me was that an Object provides a new kind of data structure tied together with various ways of manipulating the data (there is a short paragraph on Wikipedia comparing procedural programming with OOP).

Where I would have previously (for example) used a hash to store the details of a book, I can now define a type of object called a ‘book’ and in that definition I can set up a set of properties that a book has – such as Author, Title, ISBN. This formalises something that would have been much more informal if I’d just used a hash to store this information, as described above.

As well as having a data structure, Objects also have ‘methods’ – things that they can do. In practical terms a method is a (generally) self contained piece of code, that does something – not totally unlike a procedure as I described earlier. However ‘methods’ are linked specifically to objects – so you can restrict the types of thing you can do to an object by only defining the relevant methods.

The terminology around this can get a bit confusing – a quick summary:

  • Class – this is a ‘type of object’ – a general definition which says what properties and methods are linked to an object – so you might define a class of ‘book’
  • Object – a specific instance of a class – that is, if you had a ‘book’ class, any particular book would be described by an object
  • Method – an action tied to a class of object
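The three terms can be seen together in a short Ruby sketch. The citation method and its output format are invented for illustration; the properties are the Author, Title and ISBN mentioned earlier.

```ruby
# Class: the general definition of what a 'book' is.
class Book
  # Properties that every book has.
  attr_reader :author, :title, :isbn

  def initialize(author, title, isbn)
    @author = author
    @title = title
    @isbn = isbn
  end

  # Method: an action tied to this class of object.
  def citation
    "#{author}. #{title}."
  end
end

# Object: one specific book, an instance of the Book class.
dirk = Book.new("Adams, Douglas", "Dirk Gently's Holistic Detective Agency", 9780671746728)
puts dirk.citation
```

Unlike the hash, a Book object only has the properties the class defines, so there is nowhere to put a stray ‘director’.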

Thinking of sensible examples is always difficult (for me) and I’m not sure the following stands up to closer scrutiny, but I hope it demonstrates the ideas ok. Let’s say you have a library with books you can loan, and reference books that you can’t loan. In a procedural language you might achieve this by having a hash that stored the details of a book, perhaps including a ‘reference’ value – so you could store loanable and reference books like this:

{ author => "Adams, Douglas", title => "Dirk Gently’s Holistic Detective Agency", ISBN => 9780671746728, Reference => "no" }

{ author => "Adams, Douglas", title => "Long Dark Teatime of the Soul", ISBN => 9780330309554, Reference => "yes" }

You could then write a procedure that loaned the book by linking the hash describing the book to a description of a library patron. You’d then have to add in some kind of test to check the value of the ‘Reference’ key in any book hash before you ran the ‘loan’ procedure. If you forgot to run this check at any point, then because a loanable book and a reference book are to all other intents and purposes the same, running the ‘loan’ procedure on a reference book would simply result in the reference book being loaned – there would be nothing else to stop this happening.
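Here is one way the procedural version might look in Ruby. It is only a sketch: the procedure names loan and loan_if_allowed, the "on_loan_to" key and the patron label are all made up for the example, and the ‘reference’ value is stored as true/false rather than yes/no.

```ruby
# A reference book stored as a Hash.
teatime = {
  "author"    => "Adams, Douglas",
  "title"     => "Long Dark Teatime of the Soul",
  "isbn"      => 9780330309554,
  "reference" => true
}

# The 'loan' procedure: link a book to a patron. It does no checking itself.
def loan(book, patron)
  book["on_loan_to"] = patron
end

# The check has to be remembered by whoever calls 'loan'.
def loan_if_allowed(book, patron)
  return false if book["reference"]
  loan(book, patron)
  true
end

# With the check, the loan is refused.
puts loan_if_allowed(teatime, "patron 1")

# Forget the check and the reference book is loaned anyway.
loan(teatime, "patron 1")
puts teatime["on_loan_to"]
```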

If we look at an Object Oriented approach to this, instead of having a hash to store the information about each book, we would have ‘objects’ to do this. We could have one type of object (class) for loanable books, and another for reference books. We wouldn’t need to have the extra ‘Reference’ value as in the hash above, because you could easily tell which was a reference book, because it would belong to a different Class of object. In addition, because any ‘methods’ you can use are linked to the type of object (Class), you would simply define the ‘loan’ method (which would do a very similar thing to the ‘loan’ procedure above) on the ‘loanable book’ class only. You would then literally be unable to loan a ‘reference book’ type object – it would simply result in an error.
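Sketched in Ruby, with two entirely separate classes (the class names and the on_loan_to property are illustrative), the error in question is Ruby's standard NoMethodError:

```ruby
class LoanableBook
  attr_reader :title, :on_loan_to

  def initialize(title)
    @title = title
  end

  # Only loanable books get a 'loan' method.
  def loan(patron)
    @on_loan_to = patron
  end
end

class ReferenceBook
  attr_reader :title

  def initialize(title)
    @title = title
  end
  # Deliberately no 'loan' method here.
end

dirk = LoanableBook.new("Dirk Gently's Holistic Detective Agency")
teatime = ReferenceBook.new("Long Dark Teatime of the Soul")

dirk.loan("patron 1")
puts dirk.on_loan_to

begin
  teatime.loan("patron 1")
rescue NoMethodError => e
  # Ruby refuses: ReferenceBook objects have no 'loan' method.
  puts "Cannot loan a reference book (#{e.class})"
end
```

There is no check to forget this time, because the mistake cannot be expressed without Ruby raising an error.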

So taking an object oriented approach can really help in keeping control of your code, and help in making bugs more obvious and easier to track down (or avoid completely). There is an overview of OO thinking in this Ruby user’s guide which I think is useful, and I also found this OOP tutorial really helpful (although the examples are in C++ and Java, rather than Ruby).

As I started to grapple with these issues I quickly realised that using objects formalises what you are doing much more, and makes you think a lot harder about what you are trying to do and how you are going to do it right at the start of a project. It also forces you to think through how you are modelling your data much more carefully. Going back to the previous example, you probably don’t want to have two completely separate classes for loanable and reference books – they are both books after all, and will have a lot in common with each other – the only difference being you can loan one and not the other. OOP allows for this by supporting the idea of ‘classes’ and ‘subclasses’ – you can have a general class – let’s say ‘book’ – with the relevant properties and methods attached, but only properties and methods that would apply to any book – so not (in this example) the ‘loan’ method. You can then have two subclasses – the ‘loanable book’ and ‘reference book’ classes – which would ‘inherit’ all the properties and methods from the more general ‘book’ class. You would then add an additional method to the ‘loanable book’ class to enable it to be loaned – and obviously you would not add this to the ‘reference book’ class.
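A minimal Ruby sketch of the class and subclass arrangement follows; the description method and the on_loan_to property are invented to give the classes something to inherit and something to add.

```ruby
# The general class: everything that applies to any book.
class Book
  attr_reader :author, :title, :isbn

  def initialize(author, title, isbn)
    @author = author
    @title = title
    @isbn = isbn
  end

  def description
    "#{title} by #{author}"
  end
end

# Subclass: inherits everything from Book and adds 'loan'.
class LoanableBook < Book
  attr_reader :on_loan_to

  def loan(patron)
    @on_loan_to = patron
  end
end

# Subclass: inherits everything from Book and adds nothing.
class ReferenceBook < Book
end

dirk = LoanableBook.new("Adams, Douglas", "Dirk Gently's Holistic Detective Agency", 9780671746728)
teatime = ReferenceBook.new("Adams, Douglas", "Long Dark Teatime of the Soul", 9780330309554)

puts dirk.description            # inherited from Book
puts teatime.description         # inherited from Book
puts dirk.respond_to?(:loan)     # true
puts teatime.respond_to?(:loan)  # false
```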

This approach forces you to think through exactly what things you might want to do to different classes of object right from the start, and what properties you need any particular object to have. I found it made me think more ‘abstractly’ about the types of data I was dealing with. For example, if you were looking at library data, you might start thinking about books, and define classes as I’ve just described. However, then you realise that you also have DVDs you want to loan out – and that while DVDs share some things in common with books (they have titles, they can be loaned), they also have a number of different traits (they have directors), so you might start to think about modelling your library with a more abstract class (e.g. ‘Library Item’) which might set up some basic properties and methods (e.g. they have a title, they can be added to a location), and then have more definite classes for ‘books’ and ‘DVDs’ – and if there was a new item type added to your stock at a later date (e.g. ‘journals’) you could add a new subclass as appropriate.
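The more abstract model could be sketched in Ruby along these lines. The class and method names (LibraryItem, add_to_location and so on) and the location and journal title are assumptions made for the example, not a finished design.

```ruby
# The abstract class: what every item in the library has in common.
class LibraryItem
  attr_reader :title, :location

  def initialize(title)
    @title = title
  end

  def add_to_location(location)
    @location = location
  end
end

# Books add an author.
class Book < LibraryItem
  attr_reader :author

  def initialize(title, author)
    super(title)
    @author = author
  end
end

# DVDs add a director instead.
class DVD < LibraryItem
  attr_reader :director

  def initialize(title, director)
    super(title)
    @director = director
  end
end

# A new item type added later only needs a new subclass.
class Journal < LibraryItem
end

items = [
  Book.new("Dirk Gently's Holistic Detective Agency", "Adams, Douglas"),
  DVD.new("The Hitchhiker's Guide to the Galaxy", "Jennings, Garth"),
  Journal.new("Example Journal")  # hypothetical title
]

# Every item can be shelved, whatever its specific class.
items.each do |item|
  item.add_to_location("Main Library")  # hypothetical location
  puts "#{item.class}: #{item.title} (#{item.location})"
end
```

Only the DVD has a director and only the Book has an author, but the code that shelves items never needs to know which kind it is dealing with.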

This type of modelling is hard work! Even with a relatively simple task I found thinking about this took up a lot of time – and really needed to be done before I could make much of a start on actually coding. This problem of modelling brings me back to my recent post What’s so hard about Linked Data – how data is structured, and how it behaves, is at the heart of this – no matter whether you do this in software or in data schemas.

Finally, where next with my programming journey? I really enjoyed starting to learn Ruby, and I think I’ll try to do some more work with it – I’m especially interested in looking at ‘Rails’, which is a ‘web application framework’ designed to make it easier to develop web applications quickly (and which has another concept that is new to me to get my head round – MVC, or Model-View-Controller).