Tuesday, 31 January 2017

Announcing Zipkin Collector for Azure EventHub

If you are reading this, you have probably heard of Zipkin. If not, please take my word to leave this post to spend 10 minutes reading up on it - a very worthwhile 10 minutes which will introduce to you one of the best, yet simplest distributed tracing systems. It one word, it tells you where the time to serve requests been most spent helping you to optimise your Microservice architecture.

Zipkin, used by the likes of Twitter and Netflix, has already created a complete storm in the Java/JVM ecosystem, but many of us in the .NET community have not heard of it - and that is frankly a real pity. And if you have heard it and want to use it, yes of course we can try to port the whole system over to .NET but that would be a huge amount of work and frankly a waste since Zipkin is designed to work across different stacks as long as you can somehow get your data over to it. The data is normally pushed to Kafka, and Zipkin consume messages from Kafka by a component called Collector. Data then gets stored in a storage (currently available for MySQL, Cassandra or Elasticsearch) and then served by the UI.

Of course nothing stops you to run Kafka in your cloud or on-premise environment, but if you have never done it, to say the least, ZooKeeper (a consensus required for running Kafka) is not the easiest service to operate. And frankly if you are on Azure it makes a lot of sense to use EventHub, an Azure PaaS service with functionality very similar to Kafka. Sadly there were no collector for it.

I have been very keen to bring Zipkin to ASOS, but could not quite justify running ZK and Kafka, even for myself. Hence I felt something has to be done about it. The only problem: had never done a Java/Maven project before.

* * *

I have been doing what I have been doing - being a professional developer - for some time now. And I have had my ups and downs, both moments that I am proud of and moments of embarrassment because I have messed up. But never, have I just picked up a complete different stack, and built something like what I am going to share, within a couple of weeks. [Yeah I am talking about Zipkin Collector for Azure EventHub]

This really has been a testament to how pluggable and nicely designed-Zipkin is, and above all it has a truly amazing community - championed by Adrian Cole. Help was always around the corner, be it on hardcore stuff such as how to modularise collector or my noob problems with Maven.

Not to forget too, that Azure EventHub SDK basically made it completely trivial to implement a working solution. All the heavy lifting has been done by the EventProcessorHost so all is left is a bit of plumbing to get the configuration over to these components.

* * *

How to use EventHub Collector

So the idea is that you would run zipkin-server (which hosts the Zipkin UI) and in the same process you run your collector. Zipkin uses Spring Boot's auto configuration mechanism to load the right collector based on the configurations provided. The project is host on github. [UPDATE: Project has moved to OpenZipkin organisation here]

EventHub Collector gets triggered by the existence of "zipkin.collector.eventhub.eventHubConnectionString" configuration via command line. Rest of the configurations necessary can be passed by an application.properties or application.yaml file.

2- Unpackage MODULE jar into an empty folder

copy zipkin-collector-eventhub-autoconfig-x.x.x-SNAPSHOT-module.jar (that has been package in the target folder) into an empty folder and unpackage

jar xf zipkin-collector-eventhub-autoconfig-0.1.0-SNAPSHOT-module.jar

You may then delete the jar itself.

3- Download zipkin-server jar

Download the latest zipkin-server jar (which is named zipkin.jar) from here. For more information visit zipkin-server homepage.

4- create an application.properties file for configuration next to the zipkin.jar file

Populate the configuration - make sure the resources (Azure Storage, EventHub, etc) exist. Only storageConnectionString is mandatory the rest are optional and must be used only to override the defaults:

zipkin.collector.eventhub.storageConnectionString=<azure storage connection string>zipkin.collector.eventhub.eventHubName=<name of the eventhub, default is zipkin>zipkin.collector.eventhub.consumerGroupName=<name of the consumer group, default is $Default>zipkin.collector.eventhub.storageContainerName=<name of the storage container, default is zipkin>zipkin.collector.eventhub.processorHostName=<name of the processor host, default is a randomly generated GUID>zipkin.collector.eventhub.storageBlobPrefix=<the path within container where blobs are created for partition lease, processorHostName>

5- Run the server along with the collector

Assuming zipkin.jar and application.properties are in the current working directory, run this from the command line (note that the connection string to the eventhub itself is passed in the command line):

21 comments:

Maybe you could connect LinkerD (which is a "Service Mesh"/HTTP proxy you use for inter-service communication which does monitoring/metrics) to Zipkin and avoid having to instrument each service individually?

Very Informative and helpful.Thank you.Hope to read more from you.For quality courses on anything IT related give myTectra a try.myTectra is the Marketing Leader In Banglore Which won Awards on 2015, 2016, 2017 for best training in Bangalore: python interview questions

WELL VERSED AND UN FLAWEDvery helpfull blog it was a pleasure reading your blogwould love to read it more knowldege is not found but earned through hardwork and good teaching that being said click here to join us the next best thing in bangalore devops online training Devops Training in Bangalore