The big picture

As described in a previous article, we (Niklas and I) are going to use TensorFlow to classify images into pre-trained categories. The previous article was about how to train a model with TensorFlow on Kubernetes. This article describes how to use the pre-trained model, which is stored on Object Storage. As with the training, we will use Docker to host our program, but this time we will use OpenWhisk as the platform.

As in the first part, I use Google's TensorFlow for Poets training. This time I did not use the code itself, but copied the important classification parts from their script into my Python file.

OpenWhisk with Docker

OpenWhisk is an open source implementation of a so-called serverless computing platform. It is hosted by Apache and maintained by many companies. IBM offers OpenWhisk on the IBM Cloud, and for testing and playing around with it, its use is free. Besides Python and JavaScript, OpenWhisk also offers the possibility to run Docker containers. Internally, all Python and JavaScript code is executed in Docker containers anyway. So we will use the same official TensorFlow Docker image we used to build our training container.

Internally, OpenWhisk has three stages for Docker containers. When we register a new method, the execution instruction is only stored in a database. As soon as the first call reaches OpenWhisk, the Docker image is pulled from the repository, the container is initialised by a REST call to '/init' and then executed by calling the REST interface '/run'. The container stays active, and each time the method is called only the '/run' part is executed. After some time of inactivity the container is destroyed and needs to be initialised with '/init' again. After even more time of inactivity even the image is removed and needs to be pulled again.
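Stripped of all classification logic, the contract such a Docker action has to fulfil can be sketched as a minimal Flask app. This is only a sketch: the two endpoint names and the port come from the OpenWhisk protocol, everything else (state dict, response contents) is illustrative and not the code from this article's classifier.py.

```python
import flask

app = flask.Flask(__name__)
state = {}  # filled in /init, used by /run


@app.route('/init', methods=['POST'])
def init():
    # OpenWhisk calls this once after the container is started;
    # expensive setup (loading models etc.) belongs here
    state['ready'] = True
    return ('OK', 200)


@app.route('/run', methods=['POST'])
def run():
    # OpenWhisk calls this for every invocation of the action;
    # the user parameters arrive wrapped in the 'value' key
    message = flask.request.get_json(force=True, silent=True)
    if not isinstance(message, dict):
        flask.abort(404)
    params = message.get('value', {})
    return flask.jsonify({'ready': state.get('ready', False), 'params': params})


if __name__ == '__main__':
    # OpenWhisk talks to action containers on port 8080
    app.run(host='0.0.0.0', port=8080)
```

Because the container only dies after a period of inactivity, anything stored in module-level state during /init survives across many /run calls.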

The setup

The code itself is stored on GitHub. Let's first have a look at how we build the Docker container:

Dockerfile

Shell

FROM tensorflow/tensorflow:1.4.0-py3

WORKDIR /tensorflow

COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt

COPY classifier.py classifier.py
CMD python -u classifier.py

As you can see, this Dockerfile is really simple. It basically installs the Python requirements to access the Swift Object Store and starts the Python program. The Python program keeps running until the OpenWhisk system decides to stop the container.

We make heavy use of the idea of having an init and a run part in the executed code, so the Python program has two main parts: the first one is init and the second is run. Let's have a look at the init part first, which basically sets up the stage for the classification itself.

/init

Python

 1  @app.route('/init', methods=['POST'])
 2  def init():
 3      try:
 4          message = flask.request.get_json(force=True, silent=True)
 5          if message and not isinstance(message, dict):
 6              flask.abort(404)
 7
 8
 9
10          conn = Connection(key='xxxxx',
11                            authurl='https://identity.open.softlayer.com/v3',
12                            auth_version='3',
13                            os_options={"project_id": 'xxxxxx',
14                                        "user_id": 'xxxxxx',
15                                        "region_name": 'dallas'}
16                            )
17
18          obj = conn.get_object("tensorflow", "retrained_graph.pb")
19          graph_def = tf.GraphDef()
20          graph_def.ParseFromString(obj[1])
21          with graph.as_default():
22              tf.import_graph_def(graph_def)
23
24          obj = conn.get_object("tensorflow", "retrained_labels.txt")
25          for i in obj[1].decode("utf-8").split():
26              labels.append(i)
27
28      except Exception as e:
29          print("Error in downloading content")
30          print(e)
31          response = flask.jsonify({'error downloading models': e})
32          response.status_code = 512
33
34      return ('OK', 200)

Unfortunately it is not so easy to configure the init part dynamically with parameters from outside. So for this demo we need to hardcode the Object Store credentials in our source code. It doesn't feel right, but for a demo it is ok. In a later article I will describe how to change the flow and inject the parameters dynamically. So what are we doing here?

Lines 10-16 set up a connection to the Object Store as described here.

Lines 18-22 read the pre-trained TensorFlow graph directly into memory; the graph object is a global variable.

Lines 24-26 read the labels, which are basically a string of names separated by line breaks. The labels are in the same order as the categories in the graph.

By doing all this in the init part we only need to do it once, and the run part can concentrate on classifying the images without any time-consuming loading.

Tensorflow image manipulation and classification

Python

@app.route('/run', methods=['POST'])
def run():
    def error():
        response = flask.jsonify({'error': 'The action did not receive a dictionary as an argument.'})

How to get the image

The image is transferred base64 encoded as part of the request (lines 24-25). Part of the dictionary is the key payload. I chose this name because Node-RED uses the same key for its most important value. TensorFlow has a function to consume base64-encoded data as well, but I could not get it to run with the image encoding I use. So I took the little extra step here of writing the image to a file and reading it back later. By consuming it directly we could probably save some milliseconds of processing time.
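The decode-and-write detour can be sketched like this; the function name is illustrative, only the payload key comes from the description above, and the exact code in classifier.py may differ:

```python
import base64
import os
import tempfile


def save_payload_image(message):
    """Decode the base64 'payload' field of the request dict and write it to a temp file."""
    image_bytes = base64.b64decode(message['payload'])
    fd, path = tempfile.mkstemp(suffix='.jpg')
    with os.fdopen(fd, 'wb') as handle:
        handle.write(image_bytes)
    return path


# a request dictionary as /run would receive it, with a fake JPEG payload
request_dict = {'payload': base64.b64encode(b'\xff\xd8\xff\xe0fake-jpeg-bytes').decode('ascii')}
image_path = save_payload_image(request_dict)
```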

Transfer the image

Line 27 reads the image back from the file.

Line 29 decodes the JPEG into an internal representation format.

Line 30 casts the values to a float32 array.

Line 31 adds a new dimension at the beginning of the array.

Line 32 resizes the image to 224x224 to match the size of the training data.

Line 33 normalizes the image values.
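The normalization maps the byte values into a small range around zero. With the defaults from the TensorFlow for Poets script (mean 128, std 128 — an assumption here, not taken from this article's code) the arithmetic is:

```python
def normalize(pixels, mean=128.0, std=128.0):
    # shift byte values from [0, 255] to roughly [-1, 1]
    return [(p - mean) / std for p in pixels]


normalized = normalize([0, 128, 255])  # → [-1.0, 0.0, 0.9921875]
```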

Classify the image

Lines 34-35 get the input and output layers and store them in variables.

Line 36 loads the image into TensorFlow.

Line 39 is where the magic happens: TensorFlow processes the CNN with the input and output layers connected and consumes the image tensor. Furthermore, numpy squeezes all the array nesting down to a single array.

Line 40 then holds an array with a probability for each category.

Map the result to labels

The missing last step is now to map the label names to the results, which is done in lines 43 and 44.
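The mapping step itself is independent of TensorFlow and can be sketched like this (function and variable names are illustrative):

```python
def map_results(probabilities, labels):
    # the labels are in the same order as the graph's output categories,
    # so a simple zip pairs each probability with its name
    return {label: float(p) for label, p in zip(labels, probabilities)}


labels = ['daisy', 'dandelion', 'roses']
probabilities = [0.998, 0.0015, 0.0005]
scores = map_results(probabilities, labels)  # → {'daisy': 0.998, 'dandelion': 0.0015, 'roses': 0.0005}
```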

The first execution will take up to 15 seconds because the Docker image has to be pulled from Docker Hub and the graph has to be loaded from the Object Store. Later calls should take around 150 milliseconds of processing time. The parameter --result forces wsk action invoke to wait for the function to finish and also shows you the result on your command line.

{
    "daisy": 0.9998985528945923,
    "dandelion": 0.00007187054143287241,
    "roses": 4.515387388437375E-7,
    "sunflowers": 0.000029122467822162434,
    "tulips": 4.63972159303605E-11
}

If you want to get the log file and also an exact execution time, query the activation record, for example with wsk activation get.

The first call results in "duration": 3805. Your call itself took way longer, because 3805 covers only the execution of the Docker container (including init), not the time it took OpenWhisk to pull the Docker image from Docker Hub.

Being a developer advocate means always playing with the latest versions of tools and being on the edge. But installed programs get out of date, and so I always end up with old versions of CLI tools installed. One reason why I love cloud computing (aka other people's computers) so much is that I don't need to update the software; it is done by professionals. In order to always have the latest version of my Bluemix CLI tools at hand, already authenticated, I compiled a little Docker container with my favourite command line tools: cf, bx, docker and wsk.

Getting the docker container

I published the Docker container on the official Docker Hub, so getting it is very easy once the Docker tools are installed. This command will download the latest version of the container and therefore the latest versions of the installed CLI tools. We need to run it from time to time to make sure the latest version is available on our computer.

Shell

docker pull ansi/bluemixcli

Get the necessary parameters

For all the command line tools we need usernames, passwords and IDs. Obviously we cannot hardcode them into the Docker container, so we pass them along as parameters when starting it.

Username (the same as we use to log in to Bluemix)

Password (the same as we use to log in to Bluemix)

Org (the organisation we want to work in; must already exist)

Space (the space we want to work in; must already exist)

AccountID (this can be grabbed from the URL when we open "Manage Organisation" and click on the account)

Run the container

The container can be started with docker run, passing all parameters in with -e:

Shell

1  docker run -it --rm \
2    -e BX_USERNAME=<Bluemix Username> \
3    -e BX_PASSWORD=<Bluemix Password> \
4    -e BX_ORG=<Bluemix Organisation> \
5    -e BX_SPACE=<Bluemix Space> \
6    -e BX_ACCOUNT_ID=<Bluemix Account ID> \
7    -e WSK_AUTH=<OpenWhisk Authentication> \
8    -v ${PWD}:/root/host \
9    ansi/bluemixcli /bin/bash

Line 8 mounts the local directory inside the Docker container under /root/host, so we can fire up the container and have a bash with the latest tools and our source code available.

Use the tools

Before we can use the tools we need to configure them and authenticate against Bluemix. The script "init.sh", which is located in "/root/" (our working directory), takes care of all logins and authentication.
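I don't reproduce the real init.sh here; a hypothetical version doing these logins could look roughly like the sketch below. The flags are the documented ones of the respective CLIs, but the API endpoints and the script body as a whole are assumptions, not the script shipped in the container. To keep the sketch runnable anywhere, it writes the script to a file instead of executing it.

```shell
# hypothetical sketch of init.sh; NOT the script from ansi/bluemixcli
cat > init.sh <<'EOF'
#!/bin/bash
# Cloud Foundry CLI login
cf login -a https://api.ng.bluemix.net -u "$BX_USERNAME" -p "$BX_PASSWORD" -o "$BX_ORG" -s "$BX_SPACE"
# Bluemix CLI login plus container plugin initialisation
bx login -a https://api.ng.bluemix.net -u "$BX_USERNAME" -p "$BX_PASSWORD" -o "$BX_ORG" -s "$BX_SPACE"
bx ic init
# authenticate the OpenWhisk client
wsk property set --apihost openwhisk.ng.bluemix.net --auth "$WSK_AUTH"
EOF
chmod +x init.sh
```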

cf

The Cloud Foundry command line tool for starting and stopping apps and connecting services.

bx

The Bluemix version of the Cloud Foundry command line tool, including the plugin for container maintenance. By initializing this plugin we also get the credentials and settings for the docker client to use Bluemix as a Docker daemon.

docker

The normal docker client with Bluemix as daemon configured.

wsk

The OpenWhisk client already authenticated.

We can configure an alias in our .bashrc so that just typing "bxdev" gives us a bash with the latest CLI tools available.
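Such a ~/.bashrc entry could look like this (a config fragment; all values are placeholders to be replaced with your own credentials):

```shell
# hypothetical ~/.bashrc entry; replace the placeholder values
alias bxdev='docker run -it --rm \
  -e BX_USERNAME=me@example.com \
  -e BX_PASSWORD=secret \
  -e BX_ORG=my-org \
  -e BX_SPACE=dev \
  -e BX_ACCOUNT_ID=0123456789abcdef \
  -e WSK_AUTH=xxxx:yyyy \
  -v "${PWD}":/root/host \
  ansi/bluemixcli /bin/bash'
```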