Stupid Software for Clever People

Around two and a half years ago I wrote a blog post looking at the use of words like computing, code, programming etc in primary school Ofsted reports. I thought it might be a good way to track if computing was on the rise in this age group.

I’ve repeated the exercise with a sample of recent reports and a sample from 2011/12. The disappointing news is that there is no significant difference in the frequency of computing-related words.

Clearly it would have been a better story if there had been a difference… but given that I’d gone to the effort of scraping the text of almost 6000 reports from Ofsted’s website, I thought I should at least look for what the differences actually are. How has the language changed over the last couple of years?

Here are the top 75 most significant differences. The first chart shows words that appear more often in recent reports. The second shows words that appear more often in older reports. There’s some technical stuff below about the method, but what is graphed here is something called the log-likelihood of a difference – basically the bigger the line, the more significant the difference.

Firstly, I was at least encouraged by the fact that “Mathematics” is mentioned a lot more now (at least that’s computational). The other thing that jumped out at me was the change from the use of the word “disabilities” to “disabled”… a reflection of a general shift in language in this area?

As for the more educational aspects, I’m not really qualified to comment on how meaningful this analysis is, but would be interested to hear views from teachers.

The technical bits

The idea to use a log-likelihood score came from this paper by Paul Rayson and Roger Garside. However, as always with these things, there was a lot of munging required before being able to use the method.

Ofsted publish reports as PDFs – these were scraped (painfully) from their site and converted to text using Python and pyPdf. The reports contain a lot of boilerplate text and this has evolved over time. To prevent that from influencing the final results I wrote some code to remove the 1000 most often repeated lines from each corpus. Not perfect, but it seemed pretty effective. NLTK was used to tokenise the text, remove stop words and do the basic frequency counts, and then I coded up the log-likelihood scores directly from the paper referenced above. Graphs were done in R.
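
For anyone wanting to try the scoring step, here is a minimal sketch of the log-likelihood calculation as it is usually described for comparing two corpora: work out the count you would expect for a word in each corpus if there were no difference, then compare it with the observed count. This is not the original code. The function names and the plain token lists are assumptions for illustration, and it uses only the standard library rather than NLTK.

```python
import math
from collections import Counter


def log_likelihood(count_a, count_b, total_a, total_b):
    """Score one word seen count_a times in a corpus of total_a words
    and count_b times in a corpus of total_b words."""
    combined = count_a + count_b
    # Counts we would expect if the word were equally common in both corpora.
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    score = 0.0
    # A zero count contributes nothing (x * ln(x) tends to 0 as x tends to 0).
    if count_a:
        score += count_a * math.log(count_a / expected_a)
    if count_b:
        score += count_b * math.log(count_b / expected_b)
    return 2 * score


def rank_differences(tokens_a, tokens_b, top_n=75):
    """Return (word, score, overused_in) tuples, most significant first."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    total_a, total_b = len(tokens_a), len(tokens_b)
    scored = []
    for word in set(freq_a) | set(freq_b):
        score = log_likelihood(freq_a[word], freq_b[word], total_a, total_b)
        # Relative frequency says which corpus uses the word more.
        more_in_a = freq_a[word] / total_a > freq_b[word] / total_b
        scored.append((word, score, "a" if more_in_a else "b"))
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]
```

Calling `rank_differences(recent_tokens, older_tokens)` on two hypothetical token lists gives the words to chart, with the third field saying which chart each word belongs on.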

Just over four years ago I found myself having the same conversation over and over again: “Wouldn’t it be good to have a regular meet-up for coder/designer/maker/startup type people in the town I live in?”

After having this conversation one too many times I realised that there was only one way to make it happen. I bought ReadingGeekNight.com and wrote a blog post along the lines that if ten people said they would attend and one person agreed to do a talk, then I would organise it.

On that first night four people spoke to an audience of forty or so people – every month since we’ve repeated the formula and four years later we’ve clocked up almost fifty Reading Geek Nights.

However, lots of things have changed for me since we started, and organising the event every month has become more difficult. So I feel it’s time now for me to step aside.

We’ve had an amazing range of speakers talking about an eclectic bunch of topics… Interface Design; Cybercrime; Equality in Tech; 3D printing; and hundreds more. I’m humbled that so many people have given up their time to stand up and share things with us.

It’s a great event that regularly attracts fifty to sixty people with pretty minimal marketing effort, and I’d love to think that Reading Geek will carry on without me. Of course, I’ll do whatever I can to help whoever comes forward to get up and running, and I’ll always be a supporter from the sidelines (and enjoy sitting in the audience!!)

My last Geek Night (as organiser) will be on 12th November – I’m hoping that I’m overwhelmed on the night with people who are keen to carry on the tradition and to breathe new life into what’s become an established event in the Reading tech community calendar.

Thanks to everyone who ever came along and especially thanks to everyone who ever spoke at Reading Geek Night. You are all awesome.

I’ve been collecting tweets about BBC Question Time to produce these graphs of twitter reaction. As a summary of how twitter users reacted to the programme they work fairly well.

For a while I’ve been wondering about overlaying information gleaned from social media onto the video from the TV programmes as an experiment. Will it add a useful level of analysis? How easy is it to do? Does it make sense when you watch it?

So here’s my first stab, based on data I collected for the programme on 21st March. It shows a rolling graph of positive, negative and neutral sentiment and a dynamic graph of the relative frequencies of the most mentioned words.

Naturally it’s far from perfect. However (as always with these things) it’s the process of building it, getting feedback and iterating that ultimately improves it and makes it into something that’s actually useful to someone.

Comments / Questions / Observations etc welcome!

I thought it would be interesting to compare yesterday’s UK budget speech with reaction to it on twitter. It’s one of those events where a message is ‘broadcast’ and you can then judge how it was ‘received’ by analysing relevant tweets.

People often use wordclouds for this kind of thing, but there are usually better ways to compare the information. Here is a wordcloud showing what the Chancellor actually said in the House yesterday…

Chancellor’s Budget Speech

…and here’s one showing all of the tweets using the #budget hashtag made while the Chancellor was speaking.

Twitter Budget Reaction

It’s hard to see the difference. If you spend a long time with it you can pick up words that are larger in one than the other, but it’s hard work. In these cases a simple old bar graph is much easier to interpret. Here’s one which looks at the top twenty or so words (having removed ones which aren’t useful for a comparison).

This time it’s much easier to ‘spot the difference’. On twitter the words “Duty” and “Cut” featured much more heavily than in the Budget speech. The Chancellor didn’t use the word “Beer” at all. When the Chancellor referred to figures – Osborne used the word “Billion” many times – that didn’t feature particularly on twitter.

So can we draw any useful insight from the relative word frequencies? If there is a difference between the message sent and the message received, it’s that changes in duty and cuts resonate more with people* than business and figures do (even if the figures are in the billions). No surprise there then.

*more accurately… people who tweet about budgets

I’ve known Alan Bradburne (@alanb) and Matt Mower (@sandbags) for a couple of years now. We’ve often put the world to rights over coffee, but until now, I’ve never had the chance to work with either of them. Thinking that our skills might complement each other, we agreed that we’d hack something together as an experiment… and this is the result…

You know that getting feedback from your friends / peers / random-people-from-the-internet will help you

You record a video of you doing your thing (be it a 30 second elevator speech, a product demo, a song & dance… whatever) and upload it to YouTube (or maybe you point people towards someone else’s thing)

You ask friends (or whoever you want) to help by visiting the TubeInsight website.

They watch your video, while moving a slider up and down to indicate how much they like or dislike what they see as your video plays. Effectively they highlight the bits where you do well, and the bits where you don’t do so well. If you want to have a go now at giving some feedback click here

TubeInsight records and aggregates the real-time feedback from everyone.

You go to your results page on TubeInsight. There you can watch your video with an animated graph overlay which shows everyone’s feedback. To see an example click here

At this point (hopefully) you’ve learnt something. You can use your new-found knowledge however you like!
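
To make the aggregation step concrete, here is a rough sketch of one way slider readings could be combined into a single line for the overlay. It is an assumption for illustration, not how TubeInsight actually does it: the sample format (viewer, second into the video, slider value) and the choice of a simple per-second average are both made up.

```python
from collections import defaultdict


def aggregate_feedback(samples):
    """Average the slider value across viewers for each second of video.

    samples: iterable of (viewer_id, second, slider_value) tuples
             (a hypothetical format).
    Returns {second: mean slider value}, ordered by second.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _viewer, second, value in samples:
        totals[second] += value
        counts[second] += 1
    return {second: totals[second] / counts[second] for second in sorted(totals)}
```

An average hides disagreement, so a real version might also want the spread of values at each second.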

Obviously it’s a rough prototype. Maybe Like/Dislike is the wrong question to ask? Maybe the interface isn’t intuitive enough? Maybe people should be able to just review a small portion of a video etc etc.

However, even though it’s pretty basic right now, we know from the initial feedback we’ve had that there are lots of directions it could go in (if any jump out at you then do feel free to tell us, we’d love to know!)

If you know of any communities where real-time anonymous feedback on the sort of activity that can be videoed (no – let’s not go there) is valuable, then we’d love to talk to them – put us in touch.

For a side project I’m doing, I needed to be able to find out the historical position (as a latitude/longitude) of the International Space Station. Given the number of ISS tracker sites available, I’d hoped there would be an API somewhere for it. However, after much searching, I couldn’t find a single one (Wolfram Alpha’s website will give you the info, but you can’t get at the info using their API and even if you could, their terms don’t let you store the data).

Given that I needed to build something to calculate the information, I thought I may as well also publish it as a freely available API – hopefully it may save someone some work.

How it works

NORAD publishes data for earth-orbiting objects which you can use to calculate their positions. The data comes in the form of two-line element sets (TLEs) which, if you sign up for an account, you can retrieve from an API at space-track.org (if you are old-school you can use a NASA JPL telnet interface to query their database). Once you have a TLE you can calculate positions from it using a public domain algorithm. Each TLE is only accurate for a point in time, so the further you get from that time, the further out your prediction will be (around 3km of error after 24 hours). For this reason the TLEs are published several times a day.

The API works by maintaining a database of all of the published TLEs for the ISS from late 1998 up to the present time. When you make a request the API finds the nearest valid TLE and then uses that to make its calculations. Thankfully the astrophysics number-crunching side of things is handled by a library.
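
The “find the nearest TLE” step can be sketched as a search over a sorted list of epochs. This is an illustration under assumptions rather than the API’s actual code: it assumes the TLE epochs are held as a sorted list of Unix timestamps alongside the TLE records, and the function name is made up. The orbital calculation itself is left to whatever library you use.

```python
from bisect import bisect_left


def nearest_tle(epochs, tles, unixts):
    """Return the TLE whose epoch is closest to the requested time.

    epochs: sorted list of TLE epoch times as Unix timestamps (assumed layout).
    tles:   TLE records in the same order as epochs.
    """
    if not epochs:
        raise ValueError("no TLEs available")
    i = bisect_left(epochs, unixts)
    if i == 0:
        return tles[0]  # requested time is before the first TLE
    if i == len(epochs):
        return tles[-1]  # requested time is after the last TLE
    before, after = epochs[i - 1], epochs[i]
    # Pick whichever neighbour is closer in time; ties go to the earlier one.
    return tles[i] if after - unixts < unixts - before else tles[i - 1]
```

Because of the error growth described above, a stricter version could refuse to answer when the nearest epoch is more than a day or so away from the requested time.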

API details

Note: the mechanism that pulls the TLE information for this API stopped working in July 2014 – I haven’t had time to fix it, so positions are accurate before that date, but not after.

You can access the API as follows…

http://jimanning.com/issapi/?unixts=1359548643

…where unixts is a Unix Epoch time in seconds. If you omit the unixts then it will return the current position of the ISS. If you specify a time in the future, it will still make a calculation, but it won’t be accurate.
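
If you are starting from a date and time rather than an epoch value, a small helper can build the request URL for you. This is only a sketch: the function names are made up, and it stops at building the URL because the response format isn’t described here.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

API_BASE = "http://jimanning.com/issapi/"


def to_unixts(moment):
    """Convert a timezone-aware datetime to whole Unix Epoch seconds."""
    return int(moment.timestamp())


def iss_position_url(moment=None):
    """Build the request URL; with no moment, ask for the current position."""
    if moment is None:
        return API_BASE
    return API_BASE + "?" + urlencode({"unixts": to_unixts(moment)})
```

For example, `iss_position_url(datetime(2013, 1, 30, 12, 24, 3, tzinfo=timezone.utc))` produces the example URL shown above.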

BBC Question Time has become one of those TV programmes that I now rarely watch without also reading and interacting with the #bbcqt hashtag on twitter. Clearly I’m not alone – last night there were approximately 36,000 tweets on the hashtag over the hour or so that the programme was on. That’s a lot of data about a TV programme – and given the programme’s political nature there must be some really interesting information in there about politicians and the way people react to what they say.

Last night I captured every tweet using the #bbcqt hashtag that was made between 10.30pm and 11.45pm (the programme runs for an hour from 10.35pm) from the twitter API (with this volume of tweets you need to be sneaky to avoid crashing into the API limits… but it’s possible).

Before the programme I wrote a quick bit of code so that during the show I could capture which person was speaking when.

Afterwards I put together some code to…

divide the tweets up into ones that were obviously about the panellists and ones that were just generic and then further divide them up into one-minute chunks

remove all of the rubbish bits (punctuation, inconsequential words etc) from each tweet
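
The chunking and clean-up steps above might look something like the sketch below. It is an illustration, not the code used for the graphs: the (timestamp, text) tweet format, the function names and the tiny stop-word list are all assumptions, and the split into panellist and generic tweets is left out.

```python
import string
from collections import defaultdict

# A tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "is", "to", "of", "in", "it", "on"}


def clean_tweet(text):
    """Lower-case a tweet and drop links, punctuation and stop words."""
    words = []
    for raw in text.lower().split():
        if raw.startswith(("http://", "https://")):
            continue  # links carry no useful words
        word = raw.strip(string.punctuation)  # also strips leading # and @
        if word and word not in STOP_WORDS:
            words.append(word)
    return words


def bucket_by_minute(tweets, start_ts):
    """Group cleaned tweets into one-minute chunks.

    tweets:   iterable of (unix_timestamp, text) pairs (assumed format).
    start_ts: Unix timestamp of the start of the capture window.
    Returns {minute_index: [list of words for each tweet]}.
    """
    buckets = defaultdict(list)
    for ts, text in tweets:
        minute = int(ts - start_ts) // 60
        buckets[minute].append(clean_tweet(text))
    return dict(buckets)
```

Each minute’s bucket can then be fed to whatever sentiment or word-frequency step comes next.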

With the data cleaned up and analysed I then coded up a front end to display the information (for the technical people, it uses D3.js and rickshaw.js for the graphing library).

Good things

I like how you can clearly see how twitter reacts just after someone has spoken – obvious really – but nice to see the data doing what you would expect it to.

There are some interesting points where clearly one of the panellists has struck a chord on a particular topic – more positive sentiment than negative after particular comments.

Things to Improve

The classification is trained on some generic good word/bad word data – I reckon a much more accurate sentiment would be gained by training the classifier on actual #bbcqt data (especially as there’s some quite choice Anglo-Saxon swearing that the current classifier doesn’t recognise).

I ran out of time to do it, but there’s some really interesting information to be had from analysing word frequencies within the tweets – maybe one to develop later.

What’s next

I’m interested to find out if there is an appetite for this kind of (very niche, I know) analysis – do political parties monitor this stuff? Is there some valuable feedback in there for them?

A little over three years ago, I co-founded a company called SocialOptic. It’s a fantastic company, with some great products, but I’ve decided the time is right to move on.

Throughout my career I have always worked on project-based things – things with a start and an end – from building news production software at the BBC to creating a new project services team in an Oil & Gas company – from rescuing failing projects to writing business cases for future technology investments. It’s always been about starting from a concept; being creative; defining the why, who, what, when and how much; getting people on side and finances approved; managing a team building something new and then handing that on to an operations team to run with.

About three and a half years ago I started mulling over some ideas I’d had for a Project Management software product. I knew that it would be a useful tool, and could also see that no-one had built it yet… So I left my job (at the time I was contracting as a Projects and Programme Manager), rented a house in the foothills of the Sierra Nevada mountains in southern Spain, and went on an extended family holiday. In between family stuff and walks in the mountains, I taught myself to write code. It was quite a challenge, but by the end of our three months in Spain a very basic version of the product was ready and soon it was up and running and available to use on the web. I called it Milestone Planner and started to watch the sign up stats with interest.

My original plan had been to go back to contracting when we returned, but while we were away the economy tanked and pretty much every company I’d worked with before had put all new projects on hold. No new projects, no need for Project Managers. In the meantime I had met Benjamin (my soon-to-be business partner) and together we plotted and worked out how we might turn Milestone Planner from a prototype product into a fully fledged business.

We took the plunge and incorporated SocialOptic Ltd… and what a ride it’s been.

Over the last three years we have built two products from the ground up. We’ve had the satisfaction of seeing users become customers and watching them put our software right at the centre of their own project processes. I’m fantastically proud of Milestone Planner and what we have achieved. As the number of customers increases, the business is moving into a new phase, one where the key activities need to be focussed on operational and support matters rather than building *new stuff*. I’m not an operations person and never will be, so I have decided it’s the right time for me to move on. It’s tough when a co-founder leaves a business, but we’ve been able to structure things so that Benjamin can continue to run the ship, and steer SocialOptic towards the solid operational success that I’m confident it will become.

So… if you know anyone who’s looking for a creative professional, who can speak business and code and has a track record of getting things off the ground… I’m available (here’s my LinkedIn profile for a potted career history).

The first computer I touched was an Apple IIe. My Dad had bought one for his accountancy practice and one evening, after everyone had gone home, he took me to see it. It was amazing – we played Breakout, made Lemonade, and calculated our Biorhythms. Believe me, for a ten-year-old in 1980 that was pretty impressive. With the computer there were a bunch of manuals, but the one that caught my eye was titled “Apple II Basic Programming Manual”. I ‘borrowed’ it – and in its pages I discovered a brand new world.

Thirty-one years later and I wake up to the news that Steve Jobs is dead. However, for me, it’s not Apple’s obvious achievements that come to mind. I don’t think about the Mac I’m reading the news on, or the Pixar films the kids watch, or the iPhone in my pocket. What comes to mind is a single paragraph on page 12 of that manual, still remembered three decades later.

“There is nothing you can do by typing at the keyboard that can cause any damage to the computer. Unless you type with a hammer. So feel free to experiment. With your fingers.”

And having been given permission to experiment… that’s exactly what I did. Reflecting on it now, I think that paragraph probably changed my life.

The fact that the current UK ICT curriculum is pants has been a much discussed topic within the tech community for a long time. It’s focussed on consumption not creation, it ignores younger children and even if I were being charitable I’d say, at its best, it is preparing our kids for the kind of jobs they might have found in the office of ten years ago.

In the last week this topic has had some limelight after Eric Schmidt’s talk at the Edinburgh International Television Festival, prompting the mainstream media to write about the subject.

Of course the big question is… what do we do about it?

Whilst it’s right to shout loudly about the inadequacies of the current curriculum (can we call it the ‘legacy’ curriculum?), it is the easy option. To be credible we need to propose a solution – put up, or shut up.

Earlier today I mooted on twitter that we need an open-source alternative ICT curriculum – it’s not an original idea by any means and I know it’s been talked about before. Quite rightly Emma Mulqueeny (@hubmum) responded with a “there’s been talk, but what’s needed, what’s the first step?” challenge.

I have absolutely no experience of building a curriculum so I might be talking complete rubbish, but here are my starter-for-ten thoughts – they are unpolished and completely up for comment etc.

1 – Find the people who are at the intersection between…

Caring deeply about this stuff

Knowing what works and doesn’t work in the classroom

Writing a curriculum that would be credible in the eyes of whoever it is that judges whether a curriculum passes muster or not

2 – Break it down into something small

There’s no point in sitting in a room for years word-smithing an overarching curriculum (I’m guessing that’s how we ended up with what we have now). I’m far more in the ‘what’s the fastest experiment we can do to see if this has legs?’ camp. So I guess the question is: do we take a narrow part of the curriculum and develop something for all ages… or do we take one age group and develop a fuller curriculum for that? Or is there a better way to slice it?

3 – Organise it like an open-source software project

In the sense of having a public repository (GitHub or similar) and a means to incorporate contributions / changes / bugfixes – I’m quite taken by the idea of a bug fix to a curriculum.

4 – Find the fastest way to test out the first iteration

There are already plenty of enlightened teachers who do ‘get it’ – are they able to go ‘off-curriculum’? (I know that’s easy for me to say and probably very difficult in practice). Would they be able to translate the objectives of the curriculum into freely available lesson plans that could be more widely tried out?

5 – Learn from the above and repeat until we get it right

…Incomplete thoughts and rough round the edges I know, but is this (or a refinement of this) a workable way forward?