`cross` function that I literally just learnt about. I made a silly mistake by not reading the instructions carefully enough, which led to an “off by one” error, and I faffed a bit getting the sum. (As an aside, it’s a shame that a sum across rows is required so often in these problems just to give a simple number to type in as an answer. This isn’t OpenRefine’s strength and it isn’t really the crux of the problem.)
The input for the problem was a list of instructions telling you how the head of the rope was being moved; you had to calculate which points the tail went through as a result.

The rules themselves are a little complicated, but I think I made progress in writing some GREL that would correctly output the next location of the tail. However, to predict the next location of the tail I first needed to calculate the location the tail was currently at. The first move is fine: we can treat the starting point as (0,0) on a 2D grid, and if we know where the head of the rope has moved we can work out where the tail will end up. However, to work out where the tail ends up on the second ‘move’ we have to have already worked out where it ended up after the first move, which essentially limits us to calculating one row of data at a time. With a list of 2000 instructions that would mean adding one value at a time to the ‘tail location’ column, so 2000 actions to update the column. I could do that manually, or perhaps find a way of automatically generating the same instruction repeated 2000 times to fill out an operation history JSON, but it all feels really messy.
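To make the row-by-row dependency concrete, here is a minimal Python sketch of the same calculation as a conventional program. It is not the GREL I was attempting. The instruction format (`R 4` meaning “move right 4 steps”) and the following rule (the tail stays put while it is touching the head, otherwise it steps one cell towards the head on each axis) are my assumptions about the puzzle, and the names `STEPS`, `follow` and `tail_positions` are made up for illustration.

```python
# Unit steps for each direction letter (the "R 4" instruction format is an assumption).
STEPS = {"R": (1, 0), "L": (-1, 0), "U": (0, 1), "D": (0, -1)}


def sign(n):
    """Return -1, 0 or 1 according to the sign of n."""
    return (n > 0) - (n < 0)


def follow(head, tail):
    """Return the tail's next position, given the head's new position."""
    dx, dy = head[0] - tail[0], head[1] - tail[1]
    if abs(dx) <= 1 and abs(dy) <= 1:
        return tail  # still touching the head, so the tail stays where it is
    # Otherwise step one cell towards the head on each axis.
    return (tail[0] + sign(dx), tail[1] + sign(dy))


def tail_positions(instructions):
    """Return every position the tail occupies, in order, starting at (0, 0)."""
    head = tail = (0, 0)
    visited = [tail]
    for line in instructions:
        direction, count = line.split()
        step_x, step_y = STEPS[direction]
        for _ in range(int(count)):
            head = (head[0] + step_x, head[1] + step_y)
            # This is the awkward part for OpenRefine: the new tail position
            # depends on the tail position from the previous step.
            tail = follow(head, tail)
            visited.append(tail)
    return visited
```

The loop carries `tail` forward from one step to the next, which is exactly the running state that a column-at-a-time transformation struggles to express.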

So several days later I’ve decided it’s time to move on and declare this one a loss for me (and OpenRefine) for now. Possibly others might be able to work out a better way of using OpenRefine to tackle this, and maybe I’ve missed a trick in the problem that might let me solve it, but I’ve hit my limit on this and feel it simply reflects the limitations of OpenRefine. Of course I could always write a Python program inside OpenRefine to solve this. That’s always a possibility, but it’s not in the spirit of what I’m trying to do here.

The video below skips the actual code I used to work out part 2, so if you’re interested it looked a bit like this:

```
with(rowIndex-row.record.fromRowIndex,position,
  filter(
    forEachIndex(
      row.record.cells.treeHeight.value.slice(position+1),i,v,if(value<=v,i+1,null)
    )
  ,j,j!=null)[0]
)
```

This code goes through a record of tree heights (one record per column or row) and takes all the trees either before or after the current position (this is what the ‘slice’ does in the middle of this GREL). It then finds the position, counting along from the current tree, of each tree in that list that is the same size or bigger, and extracts the first of those positions. That gives us the number of trees you can go along the row, or up or down the column, before you find a tree the same size or bigger.
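For anyone who finds the nested GREL hard to read, here is a rough Python equivalent of the same idea. It is only a sketch: the function name `viewing_distance` is made up, and it returns `None` when no tree blocks the view, which is my choice for the sketch rather than something the GREL above spells out.

```python
def viewing_distance(heights, position):
    """Return how many trees you pass, looking forward from `position`,
    before reaching a tree that is the same size or bigger.

    Returns None if no such tree exists in that direction.
    """
    current = heights[position]
    # Equivalent of the slice: every tree after the current one.
    ahead = heights[position + 1:]
    # Equivalent of forEachIndex + filter: the 1-based offsets of blocking trees.
    blockers = [i + 1 for i, h in enumerate(ahead) if current <= h]
    # Equivalent of [0]: the nearest blocking tree.
    return blockers[0] if blockers else None
```

For example, with heights `[3, 0, 3, 7, 3]` and position `0`, the trees ahead are `[0, 3, 7, 3]`, the first one that is at least as tall as 3 is the second, so the result is 2.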

This is definitely not the right kind of problem for OpenRefine to solve, but it turned out to be possible, just about. I wonder if I missed a trick somewhere along the line and there could have been a simpler approach. After all, the simpler question of “how big was each directory” was trivial in OpenRefine; it was the fact that we had to calculate the path to each directory based on the list of ‘change directory’ commands that made it difficult.
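To show why the path is the awkward part, here is a minimal Python sketch of the bookkeeping involved. It is not what I did in OpenRefine. It assumes input lines that look like `$ cd name`, `$ cd ..`, `$ cd /`, `$ ls`, `dir name` and `1234 filename`, and it assumes a directory’s size includes everything in its subdirectories; the function name `directory_sizes` is made up for illustration.

```python
def directory_sizes(lines):
    """Return a dict mapping each directory path to its total size.

    The path has to be tracked as running state, because each file line
    only makes sense relative to the 'cd' commands that came before it.
    """
    sizes = {}
    path = []  # names of the directories between the root and where we are now
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        if parts[:2] == ["$", "cd"]:
            target = parts[2]
            if target == "/":
                path = []
            elif target == "..":
                path.pop()
            else:
                path.append(target)
        elif parts[0].isdigit():
            size = int(parts[0])
            # Count the file towards the current directory and every ancestor.
            for depth in range(len(path) + 1):
                key = "/" + "/".join(path[:depth])
                sizes[key] = sizes.get(key, 0) + size
        # "$ ls" and "dir name" lines carry no size, so they are skipped.
    return sizes
```

Once every file line has a full path attached, summing by directory is the easy bit, which matches my experience in OpenRefine.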

`value.split(//)` to get an array of letters from a string, as well as:
That’s quite a list of functions for something I thought was relatively straightforward!

There’s also a more appropriate use of the OpenRefine Record mode in the second part (although I still revert to the horrible hack used in day 2 to add together all the numbers in a column). Creating the records in the second part includes a neat hack to group together a set number of rows, which works easily because of a quirk of the way OpenRefine handles numbers, and which I completely fail to explain properly in the screen capture!

`cross` function in GREL, and a horrible hack to add up all the numbers in a column (really not recommended in real life! If only I’d got around to doing some work on this suggested new “Statistical” numeric facet there would be a much better solution!)

…an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. People use them as interview prep, company training, university coursework, practice problems, a speed contest, or to challenge each other.

My son had started working on it this year, and inspired me to have a go as well. However, rather than following the usual route of solving the puzzles by writing computer programmes, I’ve decided to try to solve each one using the OpenRefine software which is designed for working with large data sets.

After a few days I can already tell you OpenRefine is definitely not always the right tool for this kind of job… but that’s OK. This is just a fun challenge, and the point (for me) of doing it in OpenRefine isn’t to find the best or most efficient way of solving the problem, but rather to have fun, challenge my brain, and explore the abilities, and limitations, of OpenRefine along the way.

Each day there are two challenges, and I’m going to try to post a short video for each day I attempt, to see how far I get with the two challenges in OpenRefine. I’m playing catch-up at the moment: I’ve already solved the first few days but I’m only just getting round to recording a video… so here is Advent of Code 2022, Day 1, solved using OpenRefine.
