The last post was the preamble; now it’s time to get some actual code down. And it turns out I’ve already kicked a hornet’s nest of opinion before I’ve written a line of Clojure. Seems that programmers shouldn’t be let near mathematical notation.

Mad Math, Fury Code

Let’s have a look at the scary math bit of The Birchbox Problem again.

Breaking Down The Problem

It’s always easier to break a problem down into smaller chunks. Well, I believe so. I could come unstuck, but it’s worth a shot. On the surface this looks pretty imposing, but hopefully breaking it down will make it easier. First of all, there are three distinct algorithms at play here. Looking back at the Birchbox post I see that M is a set of products and N is a set of customers.

There’s a few sightings of Sigma notation:

This basically means “add all of them together”. I wrote about that a short while ago. There is a gotcha with this algorithm, though, but more of that later.
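As a rough sketch of how Sigma notation could map to Clojure, summing a function over a collection is a map followed by a reduce. The name sigma is my own illustration here, not anything from the Birchbox post.

```clojure
;; A minimal sketch: "sigma" is a hypothetical helper name.
;; It applies f to every item in coll and adds the results together.
(defn sigma
  "Sums (f x) for every x in coll."
  [f coll]
  (reduce + (map f coll)))

;; Sum of the numbers themselves.
(sigma identity [1 2 3 4])
;; => 10

;; Sum of the squares of 1, 2 and 3.
(sigma #(* % %) (range 1 4))
;; => 14
```

A double Sigma, over products and then customers, is just one call to sigma nested inside another.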

Max, well that means maximum so there’s something else. The max of B. Okay, so far so good. I’ll tackle the matrix things H() and B() later. First I need some products and some customers to get things started.

Creating Products

Let’s create a vector of maps, each map is a product with an id, name and stock quantity.
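Something along these lines would do as a sketch. The keys :id, :name and :stock, and every product name and quantity, are made up for illustration.

```clojure
;; Hypothetical sample data: names and stock levels are invented.
(def products
  [{:id 1 :name "Lip Balm"   :stock 100}
   {:id 2 :name "Face Cream" :stock 50}
   {:id 3 :name "Shampoo"    :stock 75}])

;; Keywords act as functions, so pulling out a field is a one-liner.
(map :name products)
;; => ("Lip Balm" "Face Cream" "Shampoo")
```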

Doesn’t feel like much but it’s a starting point. The next job is to put the next parts of the algorithm in. The main thing is, we’ve got started….

….though it’s all totally subject to change.

Calculating Happiness

The happiness function, H(), is a bit of a domain knowledge quagmire for us, as how it’s actually calculated is going to be within the realms of Birchbox. It’s the happiness of a customer based on the product they’re presented with. The result is gathered and calculated from many sources, for example customer reviews, star ratings, web site visits to product pages and even social media interaction.

This is called an objective function which is:

“the result of an attempt to express a business goal in mathematical terms for use in decision analysis, operations research or optimization studies”

Which leaves everything a bit open ended but I can create a Clojure function to mock some form of happiness output. A random number will do for the time being while we get things defined. Then we can put some boundary to say anything greater than 75 means the customer is happy with the product.

birchbox.maths.problem> (rand 101)
;; => 69.79507413967906

Becomes…

(defn calc-happiness []
  (rand 101))
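The boundary of 75 mentioned above could be sketched as a small predicate. The names happy-threshold and happy? are my own placeholders and the threshold itself is only the working guess from earlier, not a Birchbox figure.

```clojure
;; Assumption: 75 is the arbitrary cut-off mentioned in the text.
(def happy-threshold 75)

(defn happy?
  "True when a happiness score is greater than the threshold."
  [score]
  (> score happy-threshold))

(happy? 69.79507413967906)
;; => false

(happy? 80)
;; => true
```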

Binary Assignments

The nice thing with binary is that it’s going to be 0 or 1. So the binary assignment (B) is basically asking: is the product going into the customer’s box, yes or no? That seems dependent on the result of the happiness rating.
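Here’s one way that could look as a first stab, and it is only my reading of the notation. The function names, the one-row-per-customer layout and the fake-happiness stand-in (used instead of a random number so the output is repeatable) are all assumptions.

```clojure
(defn assign
  "Returns 1 if the score is greater than the threshold, otherwise 0."
  [threshold score]
  (if (> score threshold) 1 0))

(defn assignment-matrix
  "Builds a vector of rows, one row per customer and one column per product.
  happiness-fn takes a customer and a product and returns a score."
  [happiness-fn threshold customers products]
  (mapv (fn [customer]
          (mapv (fn [product]
                  (assign threshold (happiness-fn customer product)))
                products))
        customers))

;; Hypothetical, deterministic stand-in for the real happiness calculation.
(defn fake-happiness [customer product]
  (* 20 (+ (:id customer) (:id product))))

(assignment-matrix fake-happiness 75
                   [{:id 1} {:id 2}]
                   [{:id 1} {:id 2} {:id 3}])
;; => [[0 0 1] [0 1 1]]
```

Passing the happiness function in as an argument means the random mock can be swapped for something real later without touching the matrix code.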

R & D is Fun

I’m going to park this part here. I’ve managed to navigate some ideas of what’s going on and got some code down. Whether it’s right or wrong, well, it’ll take a few more iterations to get a proper handle on whether what’s been coded is going to be any use or not. Saying that, it’s been fun.

This set of blog posts (no idea how many, we’ll see how it goes) is to attempt to solve a couple of things that have been going on in my head. Firstly, there seems to be a lack of material that shows you how the language of maths is transferred to the language of code. With more and more tech news outlets constantly harping on that data scientists “need a PhD” and “at least a maths degree” it puts some programmers at a worrying disadvantage.

Now, I’ve said in the past that data science is a team; there aren’t many individuals who can easily cover, skill wise, everything a data scientist is supposed to be. I don’t have a hoverboard for a start….. If you look at the MastodonC team for example, our backgrounds are varied but collectively it’s very agile indeed.

Not Knowing Mathematical Language Is My Personal Hangup

Yes this is a personal crusade but one that will help in other departments later on I hope. It’s become an interest along with statistics to battle through and figure this stuff out. And if I can learn a programming language then there should not be any reason why I can’t learn a mathematical one too.

The best way to start learning is to look at an existing problem, so I need to find a problem.

The Birchbox Problem

The Birchbox Problem is a question that the data science team at Birchbox had to solve, being: “The Birchbox Problem is the assignment of products to subscribers such that we maximize happiness.”

Their blog post, back from 2012, gives the outline of their solution in joyous mathematical notation.

So that gives us an outline of what we need to do to solve the problem. Great if you understand it and totally confusing if you don’t. I’ll keep referring back to this while working on these blog posts.

The Code

The end game is that we have some code that satisfies the statements above. Okay, it won’t be 100% how Birchbox do things but that’s not the point; the point is we learn something about transferring math to code. That’s the real challenge.

I’ve created a GitHub repository and I’m working in Clojure. If you want to fork it and do a pull request then by all means do, I’m always happy to read and learn. It’s a public repository too, so there’s at least a little pressure on me to make some progress to aid my learning.

This has been on my mind for a good few months and I’ve pondered whether to write it, or just leave it….. ah what the heck. As usual these are merely opinions, so take it for what you will.

The Deities Sell The Dream

OH: “Huh, it’s like Zuckerberg is the god of startups…..”

There’s the gospel according to Zuckerberg, the gospel according to Jobs, the gospel according to Bezos. The same companies are used as deity markers in the likes of the tech press. And the most dangerous part is that these are the ones many of us have looked up to, or still do. And we start talking like we’re in their image….

“It’s like the Facebook of Dog Owners”

“It’s like Instagram but for Cross Stitch”

“It’s like Match.com for Hermaphrodites”

The truth is you and I will never be like them, simple as that, we won’t, our brains aren’t wired like theirs.

And the Valley is like Mount Shasta, there’s a lot of new age deities out there willing to share their dreams, visions and end goals, and they really can’t wait to see you go with them on that journey. And a lot of the time we look at the Valley as the single source of nirvana…. it’s a dangerous thought.

Accelerators as Churches?

There’s been a lot of church growth recently. Accelerators will gather you together and teach you their ways. Sounds a bit like church to me…. listen to our doctrine on “how to be a great startup” and we’ll show you where the riches are. The church is designed by the human race for human consumption, and there are times we need those places (I did, not for a startup, the other one).

As the gatherings grow, others look and think, “I want some of that too”, and more and more open up, willing to take in any startup that will listen to their preachings. Then the money men and big business get involved and they start up accelerators with the aim of helping (let’s be honest, probably owning) your dreams and visions.

Do yourself a favour and look at the leadership of this church and ask yourself this question: “How do they know this stuff they’re talking about? Did they walk this path or just read about it?”

Anyone preaching the Church of Lean and the Church of Business Model Generation, well they need not preach. You can go to your local Waterstones and buy the books there. Save yourself a lot of time, nonsense and pain. There are other paths than those too.

Look at your leaders: do they really have your best intentions at heart? Probably not. There are a few, but very few, people I know that are happy to advise without any gain on their part. They’re called friends, and they’ll be the most honest with you. Accelerators are promising all sorts and usually want to own part of your soul. It’s like the Church meets Robert Johnson and, when you think about it, that’s a lethal combination.

The Prophets

“a prophet is an individual who is claimed to have been contacted by the supernatural or the divine, and to speak for them, serving as an intermediary with humanity, delivering this newfound knowledge from the supernatural entity to other people. The message that the prophet conveys is called a prophecy.”

There are plenty of voices telling us about the “next big thing” and their shout pieces come through the foghorn of the tech press. Every year it’s the same: “What will be big in startup space for 2016”. Even the oracle Gartner only has a rough idea.

Ultimately no one really knows.

A lot of faith is put in the prophets of startup, their guide in the path to supposed riches. With a voice many follow and create things based on their potentially dodgy knowledge. And this is where the blanket terms come out…. messaging apps, BigData which has now led to AI and Deep Learning. “Make these products and you will be guided to the riches…..”. Yeah, right, sure.

The Evangelists

Oh, a special place in my heart for these ones because I’ve seen the damage caused in both church and startup.

I’ve been to a Benny Hinn crusade, 17,000 people on the inside of the arena (and it was a BIG arena in the UK) and another 5,000 outside. It’s the only time I’ve been truly scared as I thought they would actually break the doors down. There were police helicopters, the works. It was also a long time ago, my life and beliefs are my private business (as in, that’s my thing and I don’t discuss it with anyone really).

And as startups, some put the same faith in the travelling wordsmiths, the ones who can weave a story and sell it to us like the water we need. Ultimately what happens to us as individuals is irrelevant once the evangelist leaves, as long as the money is in the bucket. No aftercare, no afterthought.

The startup event industry is booming, big and spreading. They only care for startups in the sense that you’ll buy the tickets, you’ll attend, you’ll listen and you may (potentially) be mentally manipulated in such a way that you’ll come out a different founder. Once the money is in the bucket though it’s job done and on to the next event. It could be a pitch event (pay to pitch, really?!), or a “network” event where you know most of the people in the room. A tip: if you know 75% of the room already then you’re probably wasting your time.

Blessed are the money makers and the events they run.

So Where Does It End?

Well it doesn’t. If you think about the number of startups out there, does it end in the Rapture, where the 144,000 startups are suddenly picked up and taken to the promised land? Possibly, but that does beg the question about the other 98% who invested everything in going to startup church….. well, you could start over and be a born again startup.

Living in York for the majority of my life, you kind of get used to flooding; it happens frequently. With the number of floods increasing, and the chances of any one of them being a real nasty one also increasing, it might be a good idea to start peeking at past data and seeing if there’s anything jumping out at us.

In NI We Don’t Need Open Weather Data….

Getting Historical Met Office Data

While Cecelia can with 98% accuracy tell the biblical level of weather for tomorrow and the day after that, historical data is another matter. That’s where open data comes in rather handy. And those Met Office folks opened up a lot of data in 2011 on the data.gov.uk portal.

Fine if you want one day, but rather a pain if you want to get data between two dates. Now you could do some handy Unix scripting to pull the data in, but there is an issue with that. When you fire the search form the data is pulled and then forwarded to another URL with a unique id. So unless you’re great at handling redirects within Unix it’s going to be a hard slog to get the data out…… well, it was, now I have good news for you.

A Clojure Alternative?

MastodonC created an open source project called kixi.hecuba.weather. Its primary purpose is to pull historical weather data and send it up to the Hecuba platform. It’s fully open source, so even if you’re not using Hecuba you can still use some of the component parts of k.h.w. to pull the Met Office data for your needs.

The project is hosted on GitHub and anyone can use it for grabbing the Met Office data.

Clone The Project

First of all you need to clone the project. Run the following git command from your terminal or command prompt.

The namespace where the Met Office functions live is called “kixi.hecuba.weather.metoffice-api”, so you will need to change namespace first.

kixi.hecuba.weather.core=> (ns kixi.hecuba.weather.metoffice-api)

Now we can turn our attention to retrieving the Met Office data.

Retrieving The Data

The run-data-pull function takes three parameters, a start date, end date and a path where to save the files to. For example if I’m wanting to pull historical data from the 1st of January 2013 to the 1st of February 2013 I would run the following. The dates are entered as a dd/mm/yyyy format, so for example: