Phone Number + Facebook Profile = Targeted Insight

The relaxation of WhatsApp’s separation from Facebook was always going to happen; you’d have to be pretty naive to think it wasn’t. Even with encrypted messages, the message content merely skims the surface of messaging as an advertising medium. WhatsApp have your number at signup, and if you join that with the phone number on your Facebook profile... well, you’ve got the advertising segments, likes, friends, events, movements, check-ins and so on. WhatsApp just joined the big old advertising platform graph.

And if you think that the WhatsApp id is generated, which it is, remember that Facebook own the whole shebang and will easily be able to generate the same. Even when you mangle it through an MD5 digest it’s still an id of sorts. Perform the same digest on the phone number of your Facebook profile and you’ll get a match.
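To make the point concrete, here is a minimal sketch of joining two datasets on a digested phone number. The function names and the sample numbers are made up for illustration, and I’m assuming both sides strip the formatting before hashing; nothing here is how WhatsApp or Facebook actually do it.

```clojure
(require '[clojure.string :as str])
(import 'java.security.MessageDigest)

(defn normalise-number
  "Strips everything except digits so differently formatted numbers compare equal."
  [number]
  (str/replace number #"[^0-9]" ""))

(defn md5-hex
  "Returns the MD5 digest of s as a lowercase hex string."
  [^String s]
  (let [digest (.digest (MessageDigest/getInstance "MD5")
                        (.getBytes s "UTF-8"))]
    (apply str (map #(format "%02x" (bit-and % 0xff)) digest))))

(defn phone-id
  "Hypothetical id: the digest of the normalised phone number."
  [number]
  (md5-hex (normalise-number number)))

;; Two hypothetical datasets: one holds only digests, the other raw numbers.
(def messaging-ids #{(phone-id "+44 7700 900123")})
(def profile-numbers ["+447700900123" "+44 7700 900456"])

;; Digest the profile numbers the same way and the join falls out.
(def matches
  (filter #(contains? messaging-ids (phone-id %)) profile-numbers))
```

The digest hides the raw number from a casual reader, but it does nothing to stop a join by anyone who holds the same number and applies the same function.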

In the previous posts on Onyx (you can read part 1 and part 2 on how to set up a Kafka Streaming application with Onyx), there are a couple of things worth noting that I didn’t mention in the original articles.

Blocking Jobs

When you submit your job you are submitting it to Zookeeper; then, when the peers are running, they see what jobs are available to execute. The template code’s -main method does a good job of separating these actions out.

This line:

(onyx.test-helper/feedback-exception! peer-config job-id)

is looking out for exception messages on the job. If all is well it will block any form of exiting and wait for someone to kill it. Removing the line will effectively submit the job to Zookeeper, exit out and that’s that. When the peer is running it will then pick up the job. In our Kafka job the peer will pick up the job and then do its Kafka input job. Kill the peer and start it again, and the job should pick up from the last offset.

The Greedy and the Balanced

The default template works on a greedy job scheduler, meaning that if you run ten peers and the job demands three, then the remaining seven will be blocked from use.

Fine when there’s only one job submitted but a pain when there’s two or more. In the config.edn file it’s a case of changing the job scheduler from greedy to balanced. The balanced scheduler will keep the remaining available peers and allocate them to the other jobs as they come on.
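As a rough guide, the change looks something like the fragment below. I’m assuming the template keeps the scheduler setting under :peer-config in config.edn; the rest of the file is left out here and your copy may be laid out differently.

```clojure
;; config.edn (fragment): only the scheduler key is shown
{:peer-config
 {:onyx.peer/job-scheduler :onyx.job-scheduler/balanced}}
```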

Routes are 90% marketing and the small matter of staff, an aeroplane and permission to do the route. And while my suspicion that the Derry to Dublin route was going to get shelved goes way back before a Brexit vote, Brexit is the very reason used as to why it’s not going ahead. I’ve yet to see one shred of evidence to support that.

Was It Ever Going To Happen?

Well, there was a subsidy application in late 2015. Citywing operate with aircraft from Van Air Europe. Citywing don’t own aircraft; they are classed as a “virtual airline”, where they take the bookings and Van Air Europe hold the routes and the air operator’s certificate, and provide the planes, crew, maintenance and insurance.

So the real negotiation has nothing to do with Citywing and everything to do with Van Air Europe. Digging around for that info though is hard work.

Yes, it was in the works to happen though I feel the press might have jumped on it a wee bit too soon. Nothing at that point was set in stone.

City of Derry Needs Good News Stories

I’ve posted before about City of Derry Airport and what I think needed to happen to revive its place in Northern Ireland as a viable route and destination. And the shelving of the original route to Dublin in 2011 (operated by British Airways but provided by Loganair) did not help matters. It stopped for the reason many routes stopped: the subsidy ran out.

It’s no surprise CoD pushed the PR boat out. The headlines about the route starting in April 2016 filtered to the press and everyone in Co. Londonderry smiled. Thing is these things are never straightforward.

While there is evidence of the subsidy application happening I’ve yet to see any firm response or confirmation of a subsidy. Chances are it gets hard to find that info as there’s due diligence and legals that you’d not be putting online.

Bad News, Blame It On Brexit

It’s now in fashion, there’ll be a four page Vogue special feature soon, to blame things not happening on Brexit. For CoD to blame this on Brexit, well, I don’t buy it. Things started to seep out slowly in March…

All that time though Citywing were pushing the LDY > DUB route on their website, you just couldn’t book anything. So there was probably another reason. I believe there were three potential reasons. First of all, Citywing realised the route wasn’t commercially viable, especially if the subsidy wasn’t going to happen. So they spent a lot of time mulling over what to do. At this point, we still don’t know if the subsidy was offered or not.

Secondly, when your airline has four aircraft then route planning becomes a logistical operation. It’s not just Northern Ireland, it’s the Isle of Man, it’s Cardiff and all the other route combinations. It’s a fun maths problem, but that’s not what I’m going to do now.

Thirdly and most importantly, you need pilots to operate planes... Van Air were on the lookout as late as May.

While Van Air don’t say which routes the pilot would be operating, and Van Air also fly out of Belfast City Airport, there’s a good chance it was to fulfil new routes.

City of Derry cited “technical problems” in the Mark Patterson Facebook thread.

Technical problems are usually down to the wings falling off the plane or you can’t find someone to fly it. I’m going for the latter.

Until I see some actual evidence of Brexit being a problem for this route then I shall remain skeptical about the reasons.

The Story So Far….. And Beyond

In Part 1 I covered the basic setup of the Onyx platform configuration, starting Zookeeper and deploying the peer and a basic job. In this post I’m going to plug the Kafka components into the code base and set up a three broker cluster on a local machine. Then we’ll kick everything off and send some messages for processing.

Code Amendments To Our Application.

Before we dive into code, a little bit of planning is needed. First of all, how many peers do we need to run? At present my workflow is pretty simple:

:in :inc :out

A core.async channel for the input, a function to increment the value passed in and then an output channel (again via core.async). I’m going to modify this so we have a Kafka topic being read (through the :in channel). I’ll amend the math.clj code for ease of understanding so it just shows the deserialised message as a Clojure map. The output channel I’ll leave as is.

My Kafka cluster has three brokers and one partition. If a broker dies then another will be elected the leader and the data stream doesn’t suffer. Onyx operates on a one-peer-per-partition basis; if you allocate too many peers against a partition then it will throw errors, so I’m making that clear now.
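For the peer count, a quick sketch helps. Here I’m writing the workflow as Onyx-style edge pairs and assuming each task needs at least one peer, so the number of distinct tasks gives the minimum. The helper names are mine, not part of Onyx.

```clojure
;; The workflow as edge pairs: :in feeds :inc, :inc feeds :out.
(def workflow [[:in :inc] [:inc :out]])

(defn workflow-tasks
  "Returns the set of distinct task names mentioned in the workflow."
  [workflow]
  (into #{} cat workflow))

(defn min-peers
  "Minimum peers needed, assuming one peer per task."
  [workflow]
  (count (workflow-tasks workflow)))

(min-peers workflow) ; three tasks, so three peers
```

With a single partition on the topic, the :in task gets exactly one of those peers, which fits the one-peer-per-partition point above.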

So every message that Onyx reads will pass through the deserialiser (as we configured in the :in workflow). All that’s left to do is change the original math.clj file to print the map to the console.
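As an illustration of what a deserialiser does, here is a minimal sketch that turns the raw bytes of a message into a Clojure map. I’m assuming the payload is EDN text purely to keep the example free of extra dependencies; swap in whatever format your producers actually send. The function name is hypothetical.

```clojure
(require '[clojure.edn :as edn])

(defn deserialise-message
  "Decodes a UTF-8 byte array holding EDN text into Clojure data.
   Assumption: messages on the topic are EDN encoded."
  [^bytes payload]
  (edn/read-string (String. payload "UTF-8")))

;; A stand-in for the bytes a Kafka message would carry.
(def sample-bytes (.getBytes (pr-str {:user "alice" :n 1}) "UTF-8"))

(deserialise-message sample-bytes)
```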

Amending The Process Function

Let’s keep this really simple. It just prints the deserialised map to the console.
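Something along these lines would do it. This is a sketch rather than the original math.clj; the function name is made up, and I’m assuming the function receives the deserialised map and should hand it back unchanged so the :out task still gets something.

```clojure
(defn print-message
  "Prints the incoming segment (the deserialised map) and returns it unchanged."
  [segment]
  (println "Received:" segment)
  segment)

(print-message {:user "alice" :n 1})
```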

Setting Up a Three Node Kafka Cluster

Kafka runs nicely as a single node cluster but I want to use three brokers to give it some real world exposure. The config directory has a file called server.properties. I’m going to copy that for the other two brokers I want.
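Each copy needs its own broker id, port and log directory, otherwise the brokers will trip over each other on one machine. The sketch below generates the lines that differ per broker; the port numbering and the /tmp paths are assumptions for a local setup, and the helper names are mine.

```clojure
(defn broker-overrides
  "Settings that must be unique for broker n when running on one machine.
   Assumption: broker 0 keeps the default port 9092 and the others count up."
  [n]
  {"broker.id" (str n)
   "listeners" (str "PLAINTEXT://:" (+ 9092 n))
   "log.dirs"  (str "/tmp/kafka-logs-" n)})

(defn properties-lines
  "Renders the overrides as key=value lines, sorted by key."
  [overrides]
  (map (fn [[k v]] (str k "=" v)) (sort overrides)))

;; Lines to change in the copied properties files for brokers 1 and 2.
(doseq [n [1 2]]
  (println (str "# server-" n ".properties"))
  (run! println (properties-lines (broker-overrides n))))
```

Older Kafka releases use a port setting rather than listeners, so check the comments in your own server.properties before copying these keys over.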

Create a new topic

While Kafka can create new topics on the fly when messages are sent to them, Onyx doesn’t always behave when the job is submitted but the topic isn’t there. So it’s safer to create the topic ahead of time.

I’m Going To Park That There….

Okay, that was a long gig for anyone. We’ve covered integrating the Onyx Kafka plugin into a project and amending the code to allow it to consume Kafka messages from a topic and deserialise them to a Clojure map.

Next time, when I get to it, I’ll do something funky with the map and give it some real world use case.