Machine learning is not new, and neither is TensorFlow. By open-sourcing the tech, Google has brought the ML domain within the grasp of software engineers. Google is refining TensorFlow’s toolchain continuously but the barrier to entry for developers is still high. Not only is the domain of machine learning tough for a beginner to wrap their head around, TensorFlow itself presents a relatively high upfront complexity for engineers trying out the tech, not to mention that building the tools will sorely try one’s patience. TensorFlow Serving is meant to address deployment to production environments but its tooling is similarly complex to navigate.
This tutorial is an attempt to shorten the time it takes to deploy a (pre-trained) TensorFlow image recognition model in a web application built around Spring Boot. It circumvents the entire Python and Docker-centered installation requirements and instead uses a pre-trained image recognition model published by Google and the official TensorFlow for Java library to load the model and classify new images.

Run the following in your Terminal/bash (you need to have both git and JDK8 installed):
git clone https://github.com/florind/inception-serving-sb.git
cd inception-serving-sb
./gradlew fetchInceptionFrozenModel bootrun

Now browse to http://localhost:8080 and upload an image. The model will classify the image and will report what it has identified in the image alongside a certainty percentage. You should see something like this:

The rest of this post is a dive into the details of how such a feat was possible and a peek into some of the tech behind TensorFlow.

What just happened?

The code loads a TensorFlow frozen model that has been previously built. Such a model is the eventual output of so-called fitting and inference (or training & evaluation): a process that takes a computational graph plus training and evaluation data, runs the training data through the graph while gradually adjusting its variables along the gradients of a loss function to find the combination that yields the best results on the training data, then runs the resulting model on the evaluation data to check its accuracy. This trial-and-error training & evaluation process repeats for a predefined number of steps until a final graph is produced. It is computationally intensive and this is where GPUs provide significant acceleration. The resulting model can then be exported (serialized) and later used to classify (or infer) new data. In our case it will try to identify new images based on what it has learned from the training data.
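
To make the training & evaluation loop a little more concrete, here is a minimal sketch in plain Java. It has nothing to do with TensorFlow's actual implementation: it fits a single variable w in the model y = w * x by gradient descent and then checks the result on held-out data. The class name, the toy data, the step count and the learning rate are all assumptions chosen for illustration.

```java
import java.util.Locale;

/** Minimal sketch: fit y = w * x by gradient descent, then evaluate on held-out data. */
final class TinyTrainer {

    /** Mean squared error of the model y = w * x over the given samples. */
    static double loss(double w, double[] xs, double[] ys) {
        double sum = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double err = w * xs[i] - ys[i];
            sum += err * err;
        }
        return sum / xs.length;
    }

    /** Runs a fixed number of gradient descent steps and returns the learned weight. */
    static double train(double[] xs, double[] ys, int steps, double learningRate) {
        double w = 0.0; // the single trainable variable
        for (int step = 0; step < steps; step++) {
            double grad = 0.0;
            for (int i = 0; i < xs.length; i++) {
                // derivative of (w * x - y)^2 with respect to w
                grad += 2.0 * (w * xs[i] - ys[i]) * xs[i];
            }
            grad /= xs.length;
            w -= learningRate * grad; // move against the gradient
        }
        return w;
    }

    public static void main(String[] args) {
        double[] trainX = {1, 2, 3, 4};
        double[] trainY = {2, 4, 6, 8}; // toy data generated by y = 2x
        double[] evalX = {5, 6};
        double[] evalY = {10, 12};

        double w = train(trainX, trainY, 200, 0.01);
        System.out.printf(Locale.ROOT, "learned w = %.4f, evaluation loss = %.6f%n",
                w, loss(w, evalX, evalY));
    }
}
```

Freezing, in this toy picture, would amount to writing the learned w into the model as a constant so that it can be shipped without any training code.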

Google has open-sourced the exporting process as well through a project called TensorFlow Serving. So to start using learned models you just need to have an exported model handy, then feed new data to it for classification.

This is exactly what we just did. The pre-trained model I use in this project is called the Inception Model, a high quality image recognition model (again) open-sourced by Google. Its graph is trained to distinguish among 1,000 objects, including catamarans like the one above. The build script has a step that downloads and unpacks it: ./gradlew fetchInceptionFrozenModel

The project code then loads this graph during application startup and classifies new images based on this loaded graph. The output is the top inference (code here), although the graph also computes many more candidates with decreasing levels of probability.

Pre-trained frozen models

An important detail is that the TensorFlow exported model has to be frozen. That means that all variables are replaced by constants and thus “frozen” in the resulting graph which is generated in protobuf format. Along the process of freezing the inception graph, a text file containing the classification labels is also generated. This is important as it allows mapping label IDs that are computed at classification with actual label names (see the labels list lookup after classification here).
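
The label lookup can be sketched without any TensorFlow dependency. The snippet below is not the project's code: it assumes the graph's output has already been copied into a float array of per-label probabilities, and that the labels file has been read into a list whose index matches the label ID. The class name and the example labels are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: map the highest-scoring label IDs of a classification back to label names. */
final class TopLabels {

    /** A label name together with the probability the model assigned to it. */
    static final class Prediction {
        final String label;
        final float probability;

        Prediction(String label, float probability) {
            this.label = label;
            this.probability = probability;
        }
    }

    /** Returns the k most probable labels, most probable first. */
    static List<Prediction> topK(float[] probabilities, List<String> labels, int k) {
        Integer[] ids = new Integer[probabilities.length];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        // Sort label IDs by descending probability.
        Arrays.sort(ids, (a, b) -> Float.compare(probabilities[b], probabilities[a]));

        List<Prediction> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, ids.length); i++) {
            int id = ids[i];
            result.add(new Prediction(labels.get(id), probabilities[id]));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("kite", "catamaran", "sandbar"); // hypothetical labels
        float[] probabilities = {0.05f, 0.80f, 0.15f};
        for (Prediction p : topK(probabilities, labels, 2)) {
            System.out.println(p.label + " " + p.probability);
        }
    }
}
```

Calling topK with k = 1 gives the single top inference; a larger k exposes the lower-probability candidates the graph also computes.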

You also have to know how to query the graph. The tf.outputLayer attribute configured in my project here is something I actually spotted here, which took a while to figure out since I didn’t generate the frozen graph myself but took one prepared by the TensorFlow team with no documentation around it. The code is good enough if you know what you’re looking for, but finding it is still time consuming. Querying the output tensor is also key. I’ve used this code to understand how the graph classifies an image and get the actual useful information (the category label indices).

You can get other models loaded using this code. I recommend either training and freezing graphs yourself (requires real ML skills) or using well-documented frozen graphs that you can load and query painlessly.

Retraining

The Inception model can only identify the 1,000 objects it has been trained on. Give this tool to kids and they’ll choose the character of the day. A few examples of funny fails:

This is why “retraining” exists. It takes a pre-trained model and allows adding new classifications. For the Inception model we can add more images to classify, like cartoon characters, celebrity faces etc. I’d really like to see a flourishing market growing around pre-trained & retrained models: it would allow applications to leap directly into employing deep learning (computer vision, NLP etc.) without incurring the upfront cost of the ML itself.

Speed-up image detection

Here are a few tricks to speed up image recognition. It’s still a bit far from real-time computer vision but it can get pretty close with the right hardware.

Graph optimization: the choice of pre-trained graph matters for performance. It takes more than one second to classify an image on my MacBook with Inception v3. It took around 3 seconds when I tried Inception v4, frozen from this checkpoint file, and I found the accuracy not greatly improved. Google released MobileNet just last month, a model specifically designed for running on devices. You can also run graph inference optimization to reduce the amount of computation when inferring. In my tests it boosted the speed by 5-10%.

Compiling the TensorFlow JNI for your platform: the TensorFlow jarfile published in Maven is likely not optimized for your platform. Inference uses the GPU if available and, even if you don’t have one, the native TensorFlow library can make use of your machine’s hardware to accelerate computations. A warning like “The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.” means you should compile the JNI driver for your platform. Howto available here. My tests show an inference speed-up of up to 20%!

Java vs. Python: My empirical tests have shown that running image recognition with the Python-driven TensorFlow Serving toolchain had similar performance to its Java counterpart (I ran the whole thing in Docker though).

Deploy in production

Build a Docker image using the supplied Dockerfile. It safely runs in Kubernetes as well. You can also package a self-contained fat jar using ./gradlew build. The resulting jarfile is in build/libs and you can execute it using the regular java -jar build/libs/inception-serving-sb .jar command.

Why Spring Boot?

Because Java devs can be hugely productive with it and effortlessly follow the best industry practices while at it. Because it takes a minute to build and deploy this webapp and have image detection out of the box. You’ll also notice that the integration test also outputs API documentation through the excellent Spring REST Docs project. Actually the integration test is loading the actual Inception graph and executes image recognition without mocking. The project is using the latest Spring Boot 2.0.0 which enables reactive programming although I’ve used none of that here yet.

Final thoughts

Deploying pre-trained models in production is not difficult although I’d like to see a streamlined way of publishing them beyond Google. To conceive and train any model is the bread and butter of machine learning, requiring expertise, computing power and time. A marketplace of production-ready pre-trained models will flourish only if they are easy to adopt in production.

The Software Engineering field is a liberal one with a relatively low barrier to entry. Formal education is valued less than hands-on experience, but the latter tends to be difficult to measure accurately. The industry came up with various solutions to gauge a candidate’s skills, ranging from coding tests and whiteboard interviews to reviewing their public work or asking the candidate for a work portfolio. Programming is also an art, isn’t it?

Having been on both sides of the interview table, I’ve long been preoccupied with finding the right fit for the job at hand. Here’s a simple method that helps me qualify candidates’ skills by using the distinction that traditional technical industries make between technician and engineer.
Wikipedia defines a technician as follows:

A technician is a worker in a field of technology who is proficient in the relevant skills and techniques, with a relatively practical understanding of the theoretical principles.

Engineers design materials, structures, and systems while considering the limitations imposed by practicality, regulation, safety, and cost.
[…]
A professional engineer is competent by virtue of his/her fundamental education and training to apply the scientific method and outlook to the analysis and solution of engineering problems. He/she is able to assume personal responsibility for the development and application of engineering science and knowledge, notably in research, design, construction, manufacturing, superintending, managing and in the education of the engineer.

There is a major difference between the skills used by an electrician mounting or fixing your broken wall plug (who cares if it’s using AC and not DC?) and the skills needed by the engineer who designed it. Consequently, there is a similar difference between someone hacking together a website with Nginx and another who designs the browser that runs that website or the one devising the HTTP specification used by both the browser and the Nginx-driven website.

Hints on how to distinguish a software technician from a software engineer:
– A technician will tend to use a restricted set of tools. An engineer knows what tool to use to get the job done: programming language choice, databases, frameworks, third party libraries, etc.
– An engineer’s work is articulate, symmetric and consistent while striving to address all other concerns like regulation, safety and cost (sic).
– An engineer follows the idioms of the language or framework she’s using. This goes beyond personal coding style and into leveraging language constructs and recognizable patterns that best fit the solution.
– Ask the five whys when inquiring about a technical detail. Engineers know how to explain their architectural, design as well as implementation choices. A technician will get a certain job done but will not be able to explain the design decisions behind the building blocks he’s using.
– An engineer’s work is long lasting with the architectural intent surviving refactorings and indeed guiding the evolution of a system.

The seniority scale can be applied to both trades. A Sr. Engineer masters the discipline by making the right architecture, design and implementation choices given both small and large scale systems. A senior technician will be able to quickly solve a difficult local problem with the tools they have but they recognize they need the help of an engineer to correctly design complex systems.

Lastly, both trades have virtues and technology companies need both skill sets, just as a hospital needs both nurses and doctors to function.

This has been a long time coming, five years actually. Spincloud went live in January 2009 and I added the temperature heatmap overlay a few months later.
The heatmap and the corresponding map overlay are generated once an hour and the job has been faithfully doing so for six years. It works by first generating a global temperature heatmap, then cutting tiles (up to zoom level 6). Every hour.
What I also did back then was save one heat map image per day. The idea was to generate a timelapse at some point, showing a visualization of the global temperatures over a longer span of time.
Now it’s time to show the result: below are 5 years of global temperatures in a one minute timelapse (make sure to switch to the HD version if not enabled by default):

* I started collecting this data in Aug. 2009 but the lapse above starts in July 2010 as I have a data gap between Dec. 2009 and Jul. 2010. Frankly I can’t remember why.

The timelapse neatly shows the SYNOP and METAR coverage globally. The weather stations behind these data sources report at various times of day and I figured that 19:00 GMT is the hour of the day with the most reporting stations. It looks like this changed about a year ago: Russia’s remote locations are no longer well covered at this time of day. The rest of the landmass is reasonably well covered.

I think it’s interesting to explain how I have actually generated the temperature heat map. The code you’ll see is pre Java 8 since I haven’t upgraded the code much in the past years.

Spincloud uses several sources of weather data and it’s quite remarkable how little they have changed over the past 6 years. I witnessed some data sources going offline but the core of the data still comes from the same sources. For the heatmap I use the METAR and SYNOP global data, mostly coming from the NWS servers, which have been providing clean and reliable data for many years. The data is stored in the local database and essentially consists of a map point, a temperature and a timestamp.
With these sources at hand, the logic to generate a heatmap image is as follows:
1. Iterate over each pixel on the global map and get the respective temperature. Only include land masses.
2. Interpolate that temperature with the temperatures of the nearest locations where there are temperature readings
3. Generate a 2×2 pixel rectangular area with a color that corresponds to the interpolated temperature. Only fill land masses in order to make the overlay look realistic.
4. Append that image in memory to the global heat map image in the correct location
5. Repeat until complete then dump the generated image to disk

It turns out that at any given time there are fewer than 20,000 temperature data points globally, so I can keep all of them in a list in memory. In step 1, when iterating over each map pixel, this list is searched for temperature points in the vicinity.
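
The five steps can be sketched roughly as follows. This is not Spincloud's code: the interpolation method is not stated above, so inverse distance weighting within a fixed pixel radius is an assumption, as are the color ramp (-40 to +40 °C mapped from blue to red), the boolean land mask standing in for the mask image, and every class and method name. The sketch also writes each 2×2 block straight into the output image rather than building and appending a separate small image.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;
import java.util.List;

/** Sketch of the heatmap steps: per-pixel lookup, interpolation, 2x2 fill on land only. */
final class HeatmapSketch {

    /** One temperature reading already projected to pixel coordinates. */
    static final class Reading {
        final int x;
        final int y;
        final double celsius;

        Reading(int x, int y, double celsius) {
            this.x = x;
            this.y = y;
            this.celsius = celsius;
        }
    }

    /** Inverse-distance-weighted temperature at (x, y); NaN when no reading lies within the radius. */
    static double interpolate(int x, int y, List<Reading> readings, double radius) {
        double weighted = 0.0;
        double weights = 0.0;
        for (Reading r : readings) {
            double dx = r.x - x;
            double dy = r.y - y;
            double d2 = dx * dx + dy * dy;
            if (d2 > radius * radius) {
                continue; // too far away to matter
            }
            if (d2 == 0.0) {
                return r.celsius; // a station sits exactly on this pixel
            }
            double weight = 1.0 / d2;
            weighted += weight * r.celsius;
            weights += weight;
        }
        return weights == 0.0 ? Double.NaN : weighted / weights;
    }

    /** Maps -40..+40 degrees Celsius onto a blue-to-red hue ramp (an arbitrary choice). */
    static int colorFor(double celsius) {
        double t = Math.max(0.0, Math.min(1.0, (celsius + 40.0) / 80.0));
        float hue = (float) ((1.0 - t) * 2.0 / 3.0); // 2/3 is blue, 0 is red
        return Color.HSBtoRGB(hue, 1f, 1f);
    }

    /** Renders the overlay; land[y][x] is true for land pixels, water stays transparent. */
    static BufferedImage render(boolean[][] land, List<Reading> readings, double radius) {
        int height = land.length;
        int width = land[0].length;
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < height; y += 2) {
            for (int x = 0; x < width; x += 2) {
                if (!land[y][x]) {
                    continue;
                }
                double celsius = interpolate(x, y, readings, radius);
                if (Double.isNaN(celsius)) {
                    continue;
                }
                int argb = colorFor(celsius);
                // Fill the 2x2 block, skipping any water pixels inside it.
                for (int dy = 0; dy < 2 && y + dy < height; dy++) {
                    for (int dx = 0; dx < 2 && x + dx < width; dx++) {
                        if (land[y + dy][x + dx]) {
                            image.setRGB(x + dx, y + dy, argb);
                        }
                    }
                }
            }
        }
        return image;
    }
}
```

The resulting image could then be written to disk with javax.imageio.ImageIO.write. With fewer than 20,000 readings a linear scan per pixel is workable for an hourly job, though a spatial index would be the obvious next step if it ever became too slow.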
The map mask used to pick out the land masses, worldmap-mask.png, looks like this:

So far 1774 useful images have been collected, and counting. Two years ago I moved Spincloud to DigitalOcean (note: referral link) and kept collecting this historical data without a hitch.

To generate the timelapse I used ffmpeg and this tutorial. You’ll notice the month-year embedded in the video; I used this howto to get them in and this to figure out the subtitle format. There’s some code I wrote to generate the subtitles file so that it stays in sync with the timelapse, but it’s too boring to include.

As a side note, Spincloud runs a total of 8 background jobs collecting temperature and forecast data, generating temperature and radar tiles, and weather warning data, all on the cheapest DigitalOcean plan.

It’s been too long a dry streak on my blog (four years!) but in the meantime I’ve been actively working on several projects. About 3.5 years ago I reconnected with hardware engineering, a hobby of mine ever since I was a kid.
I am integrating GPS in one of my Arduino-compatible hardware projects and I’m using a Maestro Wireless part called A5100 that notably comes with GLONASS support (the Russian satellite navigation constellation).
There’s some pretty nice GPS support in the TinyGPS project but it lacks the more recent advancements in the field such as NMEA v3.0 or support for additional constellations. This means that some good data is ignored when parsing NMEA sentences from devices such as the aforementioned A5100.

I have therefore worked on an update to TinyGPS to incorporate some missing support that I published here https://github.com/florind/TinyGPS. This is a drop-in replacement for TinyGPS and backwards compatible with any NMEA compatible GPS receiver integrated with your Arduino.

The full complement of details is in the Github documentation. Notably, this update adds GLONASS support and exposes some interesting data when not tracking:
– Date and time
– Satellites in view

With this data we can build GPS user interfaces containing more advanced data. Here’s one I’ve built on a small Sharp ePaper LCD model LS013B4DN04

The screen capture on the right shows the GPS device tracking. Satellites in view are still showing although only three are participating in the solution. Also shown is the GNS mode indicator, the “AN” string, indicating that only the GPS constellation is used in the solution at this moment.

If you wonder what the e210 is, it’s the horizontal dilution of precision, HDOP. Divide by 100 to get to the ranges specified on Wikipedia.
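
For readers who want to decode these two fields themselves, here is a small sketch. It is written in Java purely for illustration (the library itself is Arduino C++) and is not part of TinyGPS. It assumes the NMEA GNS convention where the first character of the mode indicator refers to GPS and the second to GLONASS; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: decode a GNS mode indicator such as "AN" and an HDOP value given in hundredths. */
final class GnsFields {

    /** Converts an HDOP value expressed in hundredths (e.g. 210) to the usual scale (2.10). */
    static double hdop(int hundredths) {
        return hundredths / 100.0;
    }

    /** Human-readable meaning of one mode indicator character. */
    static String describeMode(char mode) {
        switch (mode) {
            case 'N': return "no fix";
            case 'A': return "autonomous";
            case 'D': return "differential";
            case 'P': return "precise";
            case 'R': return "RTK fixed";
            case 'F': return "RTK float";
            case 'E': return "estimated (dead reckoning)";
            case 'M': return "manual input";
            case 'S': return "simulator";
            default: return "unknown";
        }
    }

    /** Describes the indicator per constellation: first character GPS, second GLONASS. */
    static List<String> describeIndicator(String indicator) {
        String[] constellations = {"GPS", "GLONASS"};
        List<String> out = new ArrayList<>();
        for (int i = 0; i < indicator.length() && i < constellations.length; i++) {
            out.add(constellations[i] + ": " + describeMode(indicator.charAt(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(describeIndicator("AN"));
        System.out.println(hdop(210));
    }
}
```

Read this way, “AN” means GPS is contributing an autonomous fix while GLONASS is not contributing to the solution, which matches the screen capture described above.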

A5100 and other modern GPS parts promise out of the box support for Galileo, which will become operational in 2016, and BEIDOU, already operational in Asia.
As this update only adds GLONASS support, feel free to add support for others if they’re available in your area (BEIDOU for now).

So you have a great website idea and you want to build and bring that first version online as fast as you can. You figured that node.js is the way to go. You kick-off the development and after a couple of hours of hacking you realize that although you’re progressing at breakneck speed you’re missing a few important bits:

– How do I better structure my project?

– I want to test this thing. I want unit tests, UI (headless browser) tests and public API tests (I want to offer that API publicly too, of course)

– I want proper CSS and html templating

– Looks like I need non-trivial request routing, I need more than the default provided

Oh, and after you have all of this, you want to be able to deploy it to a node-ready cloud environment like Heroku without hassle.

Location Based Services are all the rage these days. The space is still being defined and the players are trying to differentiate their service offerings in order to attract the critical mass of developers. In this post I’ll draw a side-by-side comparison of the main features provided by the major Places API providers today. While I have no hard numbers to back-up the “major provider” claim, I’ll simply go for the web companies I would look for when building an application around Location services.

The features of all these APIs are designed primarily to support (and promote) the business use cases of each respective competitor. One notable exception is Yahoo’s GeoPlanet API which advertises itself as being a general purpose API for referencing places.

I won’t try to identify any “best” API in the end. This post is meant to allow the reader to make an informed decision on which API(s) to use.

A long list of companies including Twitter, Google, Foursquare, Gowalla, SimpleGeo, Loopt, and Citysearch are far along in creating separate databases of places mapped to their geo-coordinates. These efforts at creating an underlying database of places are duplicative, and any competitive advantage any single company gets from being more comprehensive than the rest will be short-lived at best. It is time for an open database of places which all companies and developers can both contribute to and borrow from.

I agree that there is duplication of effort but this is what happens with many competitive technologies (look at how many online maps are available today). Each company tries to add a competitive advantage to its offering while providing the same core functionality as the competition. Update: I started this post back in April and a lot of recent developments only reinforce this point. (Check Facebook Places and Google Places for more info).

I like the idea of an open database of places. Any company could build value-added services on top of it and sell them without having to worry about the issues that come with building and maintaining such a database, like geo-location/address accuracy and duplicate place resolution to name just a few. Techcrunch’s Schonfeld adds another issue: who can update a place and who should be in control of it, suggesting that anybody can update the database and “the best data should prevail”. This is hard and suggests a wiki-like approach for better or worse.
I’m not a fan of centralizing such a database. Since there are such great market forces at play, it may become a playground for fights (my data is better than yours), and a committee will attempt to regulate it just to push it into oblivion while everybody takes their toys and goes off to build their own database.

I have a different idea (and it’s not new either).

Businesses have a great deal of interest in such a database. It puts them on the map. They don’t particularly care who is using their place as long as the data about their business is correct and their customers easily reach their venue. Using mobile routing software to get to a place in the real world is the equivalent of not waiting more than four seconds for a webpage to load. It just has to route the customer precisely to a location.

Why not let businesses own their own geo data? All it takes is for them to have a website and add a bit of information to it to allow for auto-discovery; it’s called geotagging. It’s the same idea Matt Griffith had back in 2002 for RSS feed autodiscovery, applied to geo. The real win is for small businesses that adopt geotagging. All they need to do is add a small bit of metadata to their homepage and let web indexers do the job of collecting this data. Oh, and it’s free.
This brings a double win: companies in the mapping business get access to accurate geo information about businesses. The businesses themselves are happy that their customers can precisely find their physical location by means of address and/or geo-coordinates. Moreover, the accuracy of the data is maintained by the businesses since they want their customers to find them even when they move. A Places database that aggregates this type of data can mark these places as “verified” since they come directly from merchants. It even provides more accurate means of building forward and reverse geocoding tools.
Going forward with this model, the competition will shift their efforts from building a database of places to adding value to a (more or less) common Places database, like local promotions, and to building great mapping products that allow us, the customers, to find them.
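
As a rough idea of what the indexer side could look like, the sketch below pulls coordinates out of a geo.position meta tag, which by convention carries latitude and longitude separated by a semicolon. The class name is hypothetical, and the regular expression assumes the name attribute comes before content; a real indexer would use a proper HTML parser rather than a regex.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: extract latitude/longitude from a geo.position meta tag on a homepage. */
final class GeoTagReader {

    // Matches <meta name="geo.position" content="lat;lon"> with name before content.
    private static final Pattern GEO_POSITION = Pattern.compile(
            "<meta\\s+name=[\"']geo\\.position[\"']\\s+content=[\"']\\s*"
                    + "(-?\\d+(?:\\.\\d+)?)\\s*;\\s*(-?\\d+(?:\\.\\d+)?)\\s*[\"']",
            Pattern.CASE_INSENSITIVE);

    /** Returns {latitude, longitude} when the page carries a geo.position tag. */
    static Optional<double[]> readPosition(String html) {
        Matcher m = GEO_POSITION.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        double latitude = Double.parseDouble(m.group(1));
        double longitude = Double.parseDouble(m.group(2));
        return Optional.of(new double[] {latitude, longitude});
    }

    public static void main(String[] args) {
        String html = "<html><head>"
                + "<meta name=\"geo.position\" content=\"52.5200;13.4050\">"
                + "</head><body>Example bakery</body></html>"; // made-up page
        readPosition(html).ifPresent(p -> System.out.println(p[0] + ", " + p[1]));
    }
}
```

The business only has to maintain that one line of markup; everything else happens on the crawler's side.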

The hard part is promoting this model. If, say, half of the brick and mortar small businesses with a web presence embed geo metadata on their website, then the big players will take notice. How to get there is the real challenge.

Handling errors in a REST way is seemingly simple enough: when an error occurs upon requesting a resource, the service should return a proper status code and a body that contains a parseable message, using the content-type of the request.
The default error pages in Tomcat are ugly. Not only do they expose too much of the server internals, they are also HTML-only, which makes them a poor choice if a RESTful web service is deployed in that Tomcat container. Replacing them with simple static pages is still not enough since I want a dynamic response containing error information.
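
The core of such a dynamic error response can be sketched independently of Tomcat or any servlet API. The class below is hypothetical and deliberately small: it renders a status code and message as XML when the requested type mentions application/xml and falls back to JSON otherwise. The escaping only covers the most common characters and is an assumption-laden shortcut, not a complete encoder.

```java
/** Sketch of a REST-style error payload rendered in the representation the client asked for. */
final class RestError {
    final int status;
    final String contentType;
    final String body;

    private RestError(int status, String contentType, String body) {
        this.status = status;
        this.contentType = contentType;
        this.body = body;
    }

    /** Builds an XML error when requestedType mentions application/xml, JSON otherwise. */
    static RestError of(int status, String message, String requestedType) {
        if (requestedType != null && requestedType.contains("application/xml")) {
            String xml = "<error><status>" + status + "</status><message>"
                    + escapeXml(message) + "</message></error>";
            return new RestError(status, "application/xml", xml);
        }
        String json = "{\"status\":" + status + ",\"message\":\""
                + escapeJson(message) + "\"}";
        return new RestError(status, "application/json", json);
    }

    /** Escapes the characters that would break element content. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Escapes backslashes and double quotes; control characters are not handled here. */
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        RestError error = RestError.of(404, "No such resource", "application/json");
        System.out.println(error.status + " " + error.contentType + " " + error.body);
    }
}
```

In a servlet container, an error handler would copy status, contentType and body onto the HTTP response instead of printing them.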

Fetching, aggregating and transforming data for delivery is a seemingly complex task. Imagine a service that serves aggregated search results from Twitter, Google and Bing where the response has to be tailored for mobile and web. One has to fetch data from different sources, parse and compose the results then transform them into the right markup for delivery to a specific client platform.
To cook this I’ll need:
– a web server
– a nice way to aggregate web service responses (pipelining would be nice)
– a component to transform the raw aggregated representation into a tailored client response.

I could take a stab at it and use Apache/Tomcat, Java (using Apache HttpClient 4.0), a servlet dispatcher (Spring WebMVC) and Velocity templating but it sounds too complex.

Enter Node.js. It’s an event-based web server built on Google’s V8 engine. It’s fast, it’s scalable and you develop on it using the familiar JavaScript.
While Node.js is still new, the community has built a rich ecosystem of extensions (modules) that greatly ease the pain of using it. If you’re unfamiliar with the technology, check out the Hello World example; it should get you started.
Back to the task at hand, here are the modules I’ll need:
– Restler to get me data.
– async to allow parallelizing requests for effective data fetching.
– Haml-js for view generation
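
The fetch-in-parallel-then-aggregate idea that async provides in Node.js can be illustrated in Java as well. The sketch below is only an analogue, not the Node.js code of this post: each source is modeled as a Supplier that stands in for a real HTTP call, and the class name and the source names in main are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Sketch: start all fetches at once, then collect the responses keyed by source name. */
final class Aggregator {

    /** Runs every supplier asynchronously and waits for all of them to complete. */
    static Map<String, String> fetchAll(Map<String, Supplier<String>> sources) {
        // Kick off every request before waiting on any of them.
        Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<String>> source : sources.entrySet()) {
            pending.put(source.getKey(), CompletableFuture.supplyAsync(source.getValue()));
        }
        // Collect results in the order the sources were declared.
        Map<String, String> results = new LinkedHashMap<>();
        for (Map.Entry<String, CompletableFuture<String>> entry : pending.entrySet()) {
            results.put(entry.getKey(), entry.getValue().join());
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> sources = new LinkedHashMap<>();
        sources.put("twitter", () -> "tweets about node"); // stand-ins for real HTTP calls
        sources.put("bing", () -> "bing results for node");
        System.out.println(fetchAll(sources));
    }
}
```

The aggregated map is the raw representation that a templating step would then turn into markup for a specific client.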

Update Mar.04 Thanks to @ewolff some of the points described below are now official feature requests. One (SPR-6928) is actually scheduled in Spring 3.1 (cool!). I’ve updated the post and added all open tickets. Please vote!

This post is somewhat a response to InfoQ’s Comparison of Spring MVC and JAX-RS.
Recently I have completed a migration from a JAX-RS implementation of a web service to Spring 3.0 MVC annotation-based @Controllers. The aforementioned post on InfoQ was published a few days after my migration so I’m dumping below the list of problems I had, along with solutions.

About

My name is Florin Duroiu and I’m a freelance software engineer and tech lead based in Berlin, Germany.
I am the author of the weather website Spincloud and the feedreader Newsplorer. Follow me on Twitter.