TensorFlow Lite is the official solution for running machine learning models on mobile and embedded devices. It enables on‑device machine learning inference with low latency and a small binary size on Android, iOS, the Raspberry Pi, and other platforms. TensorFlow Lite achieves this with a number of techniques, such as quantized kernels that allow smaller and faster (fixed-point math) models.

We will optimize the SSD Lite MobileNet v2 model for a proper comparison.

You can skip the next two parts by using the provided Docker image, which has TensorFlow 1.9 and the Object Detection API pre-installed:

docker pull c1emenza/tf-1.9-object-detection-api:v1

Installing TensorFlow

If you don’t have TensorFlow installed on your host machine, install it now. You can follow the official instructions, or build TensorFlow from source using Bazel following the instructions here. Also, pay attention to the TF version, because TF is not backward compatible: a model exported with version 1.11 may not work under version 1.9. It is therefore safer to use version 1.9 for TF Lite optimization.
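That compatibility constraint boils down to pinning the major.minor series. As a minimal illustrative sketch (the is_compatible helper is hypothetical, not part of TensorFlow):

```python
def is_compatible(tf_version, recommended="1.9"):
    """Return True if tf_version belongs to the recommended major.minor series.

    Hypothetical helper: a model exported with a newer TF (e.g. 1.11)
    may fail to load under an older runtime (e.g. 1.9), so pinning the
    series up front avoids surprises later.
    """
    major_minor = ".".join(tf_version.split(".")[:2])
    return major_minor == recommended

print(is_compatible("1.9.0"))   # True
print(is_compatible("1.11.0"))  # False
```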

Installing TensorFlow Object Detection

If you are not familiar with TensorFlow Object Detection, welcome! To install it, you can follow the instructions from the official git repository.

Convert a model with TensorFlow Lite

You can skip this part too because we’ve made a pre-trained model available here.

To make these commands easier to run, let’s set up some environment variables:

export CONFIG_FILE=PATH_TO_BE_CONFIGURED/pipeline.config

export CHECKPOINT_PATH=PATH_TO_BE_CONFIGURED/model.ckpt

export OUTPUT_DIR=/tmp/tflite

We start with a checkpoint and produce a TensorFlow frozen graph with TensorFlow Lite–compatible ops. This requires the TensorFlow and Object Detection Python libraries installed above. To get the frozen graph, run the export_tflite_ssd_graph.py script from the models/research directory with this command:

python object_detection/export_tflite_ssd_graph.py \
    --pipeline_config_path=$CONFIG_FILE \
    --trained_checkpoint_prefix=$CHECKPOINT_PATH \
    --output_directory=$OUTPUT_DIR \
    --add_postprocessing_op=true

In the /tmp/tflite directory, you should now see two files: tflite_graph.pb and tflite_graph.pbtxt. Note that the add_postprocessing_op flag enables the model to take advantage of a custom optimized detection post-processing operation, which can be thought of as a replacement for tf.image.non_max_suppression. Make sure not to confuse export_tflite_ssd_graph with export_inference_graph in the same directory. Both scripts output frozen graphs, but export_tflite_ssd_graph outputs the frozen graph that we can feed to TensorFlow Lite directly, and it is the one we’ll be using.
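To build intuition for what that post-processing operation does, here is a minimal greedy non-max suppression sketch in plain Python. This is only an illustration of the idea, not the fused TFLite op, which additionally decodes box encodings and applies score thresholds:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (ymin, xmin, ymax, xmax)."""
    ymin, xmin = max(a[0], b[0]), max(a[1], b[1])
    ymax, xmax = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedily keep the highest-scoring boxes, dropping heavily overlapping ones."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep

# Two overlapping detections of one object, plus one separate detection:
boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

The second box overlaps the first with IoU 0.81, so it is suppressed; the third box does not overlap anything and survives.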

Next, we’ll use TOCO, the TensorFlow Lite Optimizing Converter, to get the optimized model. The following command converts the resulting frozen graph (tflite_graph.pb) to the TensorFlow Lite FlatBuffer format (detect.tflite). For a floating point model, run this from the tensorflow/ directory: