Deploying TensorFlow Models

Overview

While TensorFlow models are typically defined and trained using R or Python code, it is possible to deploy TensorFlow models in a wide variety of environments without any runtime dependency on R or Python:

CloudML is a managed cloud service that serves TensorFlow models using a REST interface.

RStudio Connect provides support for serving models using the same REST API as CloudML, but on a server within your own organization.

TensorFlow models can also be deployed to mobile and embedded devices including iOS and Android mobile phones and Raspberry Pi computers.

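The R interface to TensorFlow includes a variety of tools designed to make exporting and serving TensorFlow models straightforward. The basic process for deploying TensorFlow models from R is to train a model, export it as a SavedModel with export_savedmodel(), test the exported model locally with serve_savedmodel(), and then deploy it to a serving environment.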

Getting Started

To demonstrate the basics, we’ll walk through an end-to-end example that trains a Keras model with the MNIST dataset, exports the saved model, and then serves the exported model locally for predictions with a REST API. After that we’ll describe in more depth the specific requirements and various options associated with exporting models. Finally, we’ll cover the various deployment options and provide links to additional documentation.

MNIST Model

We’ll use a Keras model that recognizes handwritten digits from the MNIST dataset as an example. MNIST consists of 28 x 28 grayscale images of handwritten digits like these:

The dataset also includes labels for each image. For example, the labels for the above images are 5, 0, 4, and 1.

Using the Exported Model

The REST API for the model is served at http://localhost:8989. Because serve_savedmodel() was called with the browse = TRUE parameter, a webpage that describes the REST interface to the model is also displayed. The REST interface is based on the CloudML predict request API.

The model can be used for prediction by making HTTP POST requests. The body of the request should contain instances of data to generate predictions for. The HTTP response will provide the model’s predictions. The data in the request body should be pre-processed and formatted in the same way as the original training data (e.g. feature scaling and normalization, pixel transformations for images, etc.).
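Because the REST interface is language-neutral, any HTTP client can issue these requests. The following is a minimal sketch using only Python's standard library, not the method used elsewhere in this article. The function names build_predict_request and send_predict_request are hypothetical, and the endpoint path /serving_default/predict is an assumption that should be checked against the page describing the REST interface.

```python
import json
import urllib.request


def build_predict_request(base_url, body):
    """Prepare (but do not send) an HTTP POST carrying a JSON request body.

    base_url: address reported by the local server.
    body: JSON text containing the instances to generate predictions for.
    """
    # Assumed endpoint path; confirm it on the page describing the REST interface.
    url = base_url.rstrip("/") + "/serving_default/predict"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(request, timeout=10):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Build a request against the local address mentioned above (nothing is sent here).
request = build_predict_request("http://localhost:8989", '{"instances": []}')
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before a server is involved.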

For MNIST, the request body could be a JSON file containing one or more pre-processed images:
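As a rough illustration, the sketch below builds such a request body in Python from raw 28 x 28 pixel grids. The top-level "instances" key follows the CloudML predict request format. The input name image_input, the helper name build_predict_body, and the preprocessing (flattening each image and dividing by 255) are assumptions; substitute the input name and preprocessing that match your own model and training data.

```python
import json


def build_predict_body(images, input_name="image_input"):
    """Build a predict request body from raw 28 x 28 grayscale images.

    images: list of images, each a list of 28 rows of 28 integers in 0..255.
    input_name: assumed name of the model's input tensor.
    """
    instances = []
    for image in images:
        if len(image) != 28 or any(len(row) != 28 for row in image):
            raise ValueError("each image must be 28 x 28 pixels")
        # Assumed preprocessing: flatten row by row and scale pixels to [0, 1].
        pixels = [value / 255 for row in image for value in row]
        instances.append({input_name: pixels})
    return json.dumps({"instances": instances})


# Illustrative input only: a single all-black image.
blank_image = [[0] * 28 for _ in range(28)]
body = build_predict_body([blank_image])
print(len(json.loads(body)["instances"]))
```

The resulting text can be saved to a file such as new_image.json or sent directly as the body of the POST request.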

Similar to R’s predict function, the response includes an array with one value for each of the digits 0-9. The image in new_image.json is predicted to be a 7, since that is the position holding a value of 1, whereas the other positions hold values approximating zero.
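Reading the predicted digit off such a response amounts to finding the position of the largest value. The sketch below shows one way to do this in Python. It assumes the response nests each array under an output named "prediction" inside a top-level "predictions" list; the helper name predicted_digits is hypothetical, and the sample response is made up for illustration rather than taken from a real model.

```python
import json


def predicted_digits(response_text, output_name="prediction"):
    """Return the most likely digit for each prediction in a JSON response.

    output_name: assumed name of the model's output (its last layer).
    """
    payload = json.loads(response_text)
    digits = []
    for item in payload["predictions"]:
        scores = item[output_name]
        # The index of the largest score is the predicted digit (0-9).
        digits.append(max(range(len(scores)), key=scores.__getitem__))
    return digits


# Made-up response in which position 7 holds a 1 and the rest are zero.
sample_response = json.dumps(
    {"predictions": [{"prediction": [0.0] * 7 + [1.0] + [0.0] * 2}]}
)
print(predicted_digits(sample_response))  # prints [7]
```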

Deploying the Model

Once you are satisfied with local testing, the next step is to deploy the model so others can use it. Available options include TensorFlow Serving, CloudML, and RStudio Connect. For example, to deploy the saved model to CloudML we could use the cloudml package:

Model Export

TensorFlow SavedModel defines a language-neutral format to save machine-learned models that is recoverable and hermetic. It enables higher-level systems and tools to produce, consume and transform TensorFlow models.

The export_savedmodel() function creates a SavedModel from a model trained using the keras, tfestimators, or tensorflow R packages. There are subtle differences in how this works in practice depending on the package you are using.

keras

The Keras Example above includes complete example code for creating and using SavedModel instances from Keras so we won’t repeat all of those details here.

To export a TensorFlow SavedModel from a Keras model, simply call the export_savedmodel() function on any Keras model:

Note the message that is printed: exporting a Keras model requires setting the Keras “learning phase” to 0. In practice, this means that after calling export_savedmodel() you cannot continue to train models in the same R session.

It is important to assign reasonable names to the first and last layers. For example, in the model code above we named the first layer “image” and the last layer “prediction”.

Each instance of new data should be formatted as a JSON array, and each element in the array should be a named array corresponding to the feature columns. This structure is similar to a named list in R.

The response is the predicted MPG:

{
  "predictions": [
    {
      "predictions": [
        8.4974
      ]
    }
  ]
}
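To make the request and response shapes concrete, here is a small Python sketch that builds a body with named features and pulls the predicted value out of the response shown above. The feature names disp and cyl, their values, the choice to wrap each value in a one-element array, and the helper names are all assumptions for illustration; use the feature columns your own model was trained with.

```python
import json


def build_named_instances(rows):
    """Build a request body in which each instance names its feature columns.

    rows: list of dicts mapping a feature name to a single value.
    """
    # Assumed layout: each feature value is wrapped in a one-element array.
    instances = [{name: [value] for name, value in row.items()} for row in rows]
    return json.dumps({"instances": instances})


def extract_values(response_text, output_name="predictions"):
    """Return the first predicted value for each instance in the response."""
    payload = json.loads(response_text)
    return [item[output_name][0] for item in payload["predictions"]]


# Hypothetical feature names and values.
body = build_named_instances([{"disp": 160.0, "cyl": 6}])

# The response text shown above, written on one line.
response_text = '{"predictions": [{"predictions": [8.4974]}]}'
print(extract_values(response_text))  # prints [8.4974]
```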

tensorflow

The tensorflow package provides a lower-level interface to the TensorFlow API. You can also use the export_savedmodel() function to export models created with this API; however, you need to provide some additional parameters indicating which tensors represent the inputs and outputs for your model.

For example, here’s an MNIST model using the core TensorFlow API along with the requisite call to export_savedmodel():

Once the model is exported, it can be served with serve_savedmodel() in the same way, and the same HTTP requests demonstrated in the Keras example can be sent to the tensorflow model.

Model Deployment

There are a variety of ways to deploy a TensorFlow SavedModel, each of which is described below. Of the three methods described, two (CloudML and RStudio Connect) share the same REST interface that we have been using with serve_savedmodel() to test locally. The REST interface is described in detail here: https://cloud.google.com/ml-engine/docs/v1/predict-request.

CloudML

You can deploy TensorFlow SavedModels to Google’s CloudML service using functions from the cloudml package. For example:

Once deployed to CloudML, predictions can be made using the same REST interface we previously used locally. The HTTP POST requests will be similar to the sample requests, but CloudML additionally requires proper authorization.
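The difference from a local request can be sketched in Python as follows: the body stays the same, while the URL points at the hosted model and an Authorization header carries an access token. The URL pattern below is an assumption based on the version 1 REST API, and the project name, model name, token, and function name build_cloudml_request are placeholders. Obtaining a valid token is outside the scope of this sketch.

```python
import urllib.request


def build_cloudml_request(project, model, body, access_token):
    """Prepare (but do not send) an authorized predict request.

    project, model: names of the cloud project and the deployed model.
    body: JSON text in the same format used for local testing.
    access_token: OAuth access token supplied by the caller.
    """
    # Assumed URL pattern; verify it against the CloudML documentation.
    url = "https://ml.googleapis.com/v1/projects/{}/models/{}:predict".format(
        project, model
    )
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The only addition compared with a local request.
            "Authorization": "Bearer " + access_token,
        },
        method="POST",
    )


# Placeholder values for illustration; nothing is sent here.
request = build_cloudml_request(
    "my-project", "mnist", '{"instances": []}', "ACCESS_TOKEN"
)
print(request.full_url)
```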

See the Deploying Models article on the CloudML package website for additional details.

RStudio Connect

RStudio Connect is a publishing platform for applications, reports, and APIs created with R. Connect runs on-premises or in your own cloud infrastructure, giving you full control of the deployment environment. Connect can also integrate with your own security services and user management tools.

An upcoming version of RStudio Connect will include support for hosting TensorFlow SavedModels, using the same REST interface as is supported by the local server and CloudML.

Exported models will be published to Connect using the rsconnect package, for example:

If you would like to preview the feature, or get more information, contact sales@rstudio.com.

TensorFlow Serving

TensorFlow Serving is an open-source library and server implementation that allows you to serve TensorFlow SavedModels using a gRPC interface as opposed to the REST interface offered by the previous deployment tools.