Custom Estimators

The tfestimators framework makes it easy to build machine learning models via
its high-level Estimator API. The API offers predefined model types, such as
regressors and classifiers, that you can instantiate and configure quickly.

But what if none of the predefined model types meets your needs?
Perhaps you need more granular control over model configuration, such as
the ability to customize the loss function used for optimization, or specify
different activation functions for each neural network layer. Or maybe you’re
implementing a ranking or recommendation system, and neither a classifier nor a
regressor is appropriate for generating predictions. The figure on the right
illustrates the basic components of an estimator. Users can implement
custom behavior or architecture inside the model_fn of the estimator.

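This tutorial covers how to create your own Estimator using the building
blocks provided in the tfestimators package. The example model predicts the
ages of abalones based on their physical measurements. You’ll learn how to
instantiate an Estimator, construct a custom model_fn, configure a neural
network, define the loss for the model, and define the training op.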

An Abalone Age Predictor

It’s possible to estimate the age of an abalone (sea snail) by the number of
rings on its shell. However, because this task requires cutting, staining, and
viewing the shell under a microscope, it’s desirable to find other
measurements that can predict age.

The estimator initializer takes two main arguments:

model_fn: A function object that contains all the logic needed
to support training, evaluation, and prediction. You are responsible for
implementing that functionality. The next section, Constructing the
model_fn, covers creating a model function in detail.

params: An optional dict of hyperparameters (e.g., learning rate,
dropout) that will be passed into the model_fn.

Note: Just like tfestimators’ predefined regressors and classifiers, the
estimator initializer also accepts the general configuration arguments
model_dir and config.

For the abalone age predictor, the model will accept one hyperparameter:
learning rate. Here, learning_rate is set to 0.001, but you can tune this value as
needed to achieve the best results during model training.

The following code creates the list model_params
containing the learning rate and instantiates the Estimator:
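As a minimal sketch of this step, the snippet below builds model_params as a named list and passes it to estimator() through params. The model_fn shown here is only a hypothetical stand-in with the expected argument names, included so the snippet is self-contained; it is not the abalone model function.

```r
library(tfestimators)

# Hypothetical stand-in for the real model function, which is built up in
# the sections that follow. It is not meant to be trained.
model_fn <- function(features, labels, mode, params, config) {
  stop("model_fn is a placeholder in this sketch")
}

# Hyperparameters are collected in a named list and reach model_fn as `params`.
model_params <- list(learning_rate = 0.001)

# Instantiate the custom Estimator with the model function and hyperparameters.
model <- estimator(model_fn, params = model_params)
```

Inside model_fn, the value would then be read as params$learning_rate.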

features: A dict containing the features passed to the model via
input_fn.

labels: A Tensor containing the labels passed to the model via
input_fn. This will be empty for predict() calls, as these are the values
the model will infer.

mode: One of the following mode_keys() string values
indicating the context in which the model_fn was invoked:

"train". The model_fn was invoked in training mode, namely via a train() call.

"eval". The model_fn was invoked in evaluation mode, namely via an evaluate() call.

"infer". The model_fn was invoked in predict mode, namely via a predict() call.

model_fn may also accept a params argument containing a dict of
hyperparameters used for training, and a config argument that represents the
configuration used in a model, including GPU percentage, cluster information,
etc.
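The arguments described above fit together in a function skeleton along these lines. This is an illustrative outline only; the numbered comments stand for the steps that the rest of the tutorial fills in.

```r
library(tfestimators)

model_fn <- function(features, labels, mode, params, config) {
  # `mode` is one of the mode_keys() string values: "train", "eval", or "infer".
  # 1. Configure the model (for example, a neural network) from `features`.
  # 2. Define the loss from the model's predictions and `labels`.
  # 3. Define the training op that minimizes the loss.
  # 4. Return an estimator_spec() describing the results for `mode`.
}
```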

The body of the function performs the following tasks (described in detail in the
sections that follow):

Configuring the model—here, for the abalone predictor, this will be a neural
network.

Defining the loss function used to calculate how closely the model’s
predictions match the target values.

Defining the training operation that specifies the optimizer algorithm to
minimize the loss values calculated by the loss function.

The model_fn must return an estimator_spec object, which contains the following values:

mode (required). The mode in which the model was run. Typically, you will
return the mode argument of the model_fn here.

predictions (required in infer mode). A dict that maps key names of
your choice to Tensors containing the predictions from the model.

In infer mode, the dict that you return in estimator_spec will then be
returned by predict(), so you can construct it in the format in which
you’d like to consume it.

loss (required in eval and train modes). A Tensor containing a scalar
loss value: the output of the model’s loss function (discussed in more depth
later in Defining loss for the model) calculated over all
the input examples. This is used in train mode for error handling and
logging, and is automatically included as a metric in eval mode.

train_op (required only in train mode). An Op that runs one step of
training.

eval_metric_ops (optional). A dict of name/value pairs specifying the
metrics that will be calculated when the model runs in eval mode. The name
is a label of your choice for the metric, and the value is the result of
your metric calculation. The tf$metrics module provides predefined functions
for a variety of common metrics. For example, an eval_metric_ops dict can
contain an "accuracy" metric calculated using tf$metrics$accuracy.

If you do not specify eval_metric_ops, only loss will be calculated
during evaluation.
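To make the values listed above concrete, here is a hedged sketch of how they could be packaged into an estimator_spec. The helper make_spec and the key name "results" are hypothetical, introduced only for illustration; in practice this call usually sits at the end of the model_fn itself. The sketch assumes the TensorFlow 1.x API that tf$metrics belongs to.

```r
library(tfestimators)

# Hypothetical helper: packages tensors computed earlier in a model_fn into
# the estimator_spec that the model_fn returns.
make_spec <- function(mode, labels, predictions, loss, train_op) {
  estimator_spec(
    mode = mode,
    # The key name "results" is an arbitrary choice for this sketch.
    predictions = list(results = predictions),
    loss = loss,
    train_op = train_op,
    # Named list of metrics computed in eval mode, in addition to loss.
    eval_metric_ops = list(
      accuracy = tf$metrics$accuracy(labels, predictions)
    )
  )
}
```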

Configuring a neural network with feature_column and layers

Constructing a neural network entails creating and connecting the input
layer, the hidden layers, and the output layer.

The input layer is a series of nodes (one for each feature in the model) that
will accept the feature data that is passed to the model_fn in the
features argument. If features contains an n-dimensional Tensor with
all your feature data, then it can serve as the input layer. If features
contains a dict of feature columns passed to the model via an input function,
you can convert it to an input-layer Tensor with the input_layer function,
which takes two required arguments:

features. A mapping from string keys to the Tensors containing the
corresponding feature data. This is exactly what is passed to the model_fn
in the features argument.

feature_columns. A list of all the FeatureColumns in the model, for example
age, height, and weight.
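A sketch of this conversion, assuming three numeric features named age, height, and weight as in the text. The way the columns are declared here (with feature_columns() and column_numeric()) and the helper name build_input_layer are assumptions made for illustration.

```r
library(tfestimators)

# Assumed column definitions for the three features named in the text.
columns <- feature_columns(
  column_numeric("age"),
  column_numeric("height"),
  column_numeric("weight")
)

# Hypothetical helper, intended to be called from inside a model_fn, where
# `features` is the mapping from string keys to Tensors.
build_input_layer <- function(features) {
  input_layer(features = features, feature_columns = columns)
}
```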

The input layer of the neural network then must be connected to one or more
hidden layers via an activation function that performs a nonlinear
transformation on the data from the previous layer. The last hidden
layer is then connected to the output layer, the final layer in the model.
tf$layers provides the tf$layers$dense function for constructing fully
connected layers. The activation is controlled by the activation argument.
Some options to pass to the activation argument are:

tf$nn$relu. Passing this value creates a layer of units nodes fully
connected to the previous layer input_layer with a ReLU activation
function.

tf$sigmoid. Passing this value when creating the neural network layer
output_layer, which is fully connected to second_hidden_layer, gives that
layer a sigmoid activation function.
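The two activation choices described above might look as follows in code. The wrapper functions are hypothetical and exist only to keep the sketch self-contained; the argument names input_layer, units, and second_hidden_layer follow the text. The sketch assumes the TensorFlow 1.x tf$layers module, and units must be an integer (for example 10L).

```r
library(tfestimators)

# ReLU layer: `units` nodes fully connected to the previous layer.
relu_layer <- function(input_layer, units) {
  tf$layers$dense(inputs = input_layer, units = units,
                  activation = tf$nn$relu)
}

# Sigmoid output layer fully connected to the second hidden layer.
sigmoid_output_layer <- function(second_hidden_layer) {
  tf$layers$dense(inputs = second_hidden_layer, units = 1L,
                  activation = tf$sigmoid)
}
```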

For the abalone predictor, the network contains two hidden layers, each with
10 nodes and a ReLU activation function. The output layer has no activation
function, and is reshaped with tf$reshape to a one-dimensional tensor to
capture the model’s predictions, which are stored in predictions_dict.
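The network described above could be written roughly as follows. This sketch assumes that features already arrives as a single two-dimensional Tensor, so it can serve directly as the input layer; the function name build_network and the key name "ages" are illustrative choices.

```r
library(tfestimators)

build_network <- function(features) {
  # Two hidden layers, each with 10 nodes and a ReLU activation function.
  first_hidden_layer <- tf$layers$dense(features, 10L,
                                        activation = tf$nn$relu)
  second_hidden_layer <- tf$layers$dense(first_hidden_layer, 10L,
                                         activation = tf$nn$relu)

  # Output layer with a single node and no activation function.
  output_layer <- tf$layers$dense(second_hidden_layer, 1L)

  # Reshape to a one-dimensional tensor holding one prediction per example.
  predictions <- tf$reshape(output_layer, list(-1L))

  # Named list of predictions; the key "ages" is an assumption of this sketch.
  predictions_dict <- list(ages = predictions)

  list(predictions = predictions, predictions_dict = predictions_dict)
}
```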

Defining loss for the model

The estimator_spec returned by the model_fn must contain loss: a Tensor
representing the loss value, which quantifies how well the model’s predictions
reflect the label values during training and evaluation runs. The tf$losses
module provides convenience functions for calculating loss using a variety of
metrics, including:
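As one illustration, mean squared error is a common choice for a regression task such as age prediction. The sketch below wraps tf$losses$mean_squared_error (TensorFlow 1.x API) in a hypothetical helper; choosing this particular loss is an assumption made here, not a requirement.

```r
library(tfestimators)

# Hypothetical helper: scalar loss Tensor computed over all input examples.
compute_loss <- function(labels, predictions) {
  tf$losses$mean_squared_error(labels = labels, predictions = predictions)
}
```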

Supplementary metrics for evaluation can be added to an eval_metric_ops dict.
The following code defines an rmse metric, which calculates the root mean
squared error for the model predictions. Note that the labels tensor is cast
to a float64 type to match the data type of the predictions tensor, which
will contain real values:
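A sketch of such a metric definition, assuming (as the text states) that the predictions tensor is float64. The helper name rmse_metric_ops is hypothetical; tf$metrics$root_mean_squared_error is part of the TensorFlow 1.x API.

```r
library(tfestimators)

# Hypothetical helper: builds the eval_metric_ops list with an rmse entry.
rmse_metric_ops <- function(labels, predictions) {
  list(
    rmse = tf$metrics$root_mean_squared_error(
      # Cast labels so their type matches the float64 predictions.
      tf$cast(labels, tf$float64),
      predictions
    )
  )
}
```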

Defining the training op for the model

The training op defines the optimization algorithm TensorFlow will use when
fitting the model to the training data. Typically when training, the goal is
to minimize loss. A simple way to create the training op is to instantiate a
tf$train$Optimizer subclass and call the minimize method.

The following code defines a training op for the abalone model_fn using the
loss value calculated in Defining loss for the model, the
learning rate passed to the function in params, and the gradient descent
optimizer. For global_step, the convenience function
tf$train$get_global_step takes care of generating an integer variable:
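A minimal sketch of this training op, using tf$train$GradientDescentOptimizer from the TensorFlow 1.x API. The helper name make_train_op is hypothetical; inside a model_fn, learning_rate would come from params$learning_rate.

```r
library(tfestimators)

# Hypothetical helper: returns an Op that runs one step of training.
make_train_op <- function(loss, learning_rate) {
  optimizer <- tf$train$GradientDescentOptimizer(
    learning_rate = learning_rate
  )
  # Minimize the loss and advance the global step counter on each step.
  optimizer$minimize(
    loss = loss,
    global_step = tf$train$get_global_step()
  )
}
```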

The complete abalone model_fn

Here’s the final, complete model_fn for the abalone age predictor. The
following code configures the neural network; defines loss and the training op;
and returns an estimator_spec object containing mode, predictions_dict, loss,
and train_op:
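Putting the pieces together, one possible version of the model function is sketched below. It assumes the TensorFlow 1.x API, a features argument that is a single float64 Tensor, mean squared error as the loss, and "ages" as the prediction key. The early return in "infer" mode is an addition of this sketch, made because labels are empty for predict() calls.

```r
library(tfestimators)

model_fn <- function(features, labels, mode, params, config) {
  # Two ReLU hidden layers with 10 nodes each, then a linear output layer.
  first_hidden_layer <- tf$layers$dense(features, 10L,
                                        activation = tf$nn$relu)
  second_hidden_layer <- tf$layers$dense(first_hidden_layer, 10L,
                                         activation = tf$nn$relu)
  output_layer <- tf$layers$dense(second_hidden_layer, 1L)

  # Reshape the output to a one-dimensional tensor of predictions.
  predictions <- tf$reshape(output_layer, list(-1L))
  predictions_dict <- list(ages = predictions)

  # Assumption of this sketch: no labels exist in predict mode, so return
  # before computing loss, metrics, or the training op.
  if (mode == "infer") {
    return(estimator_spec(mode = mode, predictions = predictions_dict))
  }

  # Loss: mean squared error between labels and predictions.
  loss <- tf$losses$mean_squared_error(labels, predictions)

  # Supplementary evaluation metric; labels are cast to match predictions.
  eval_metric_ops <- list(
    rmse = tf$metrics$root_mean_squared_error(
      tf$cast(labels, tf$float64), predictions
    )
  )

  # Training op: gradient descent with the learning rate from params.
  optimizer <- tf$train$GradientDescentOptimizer(
    learning_rate = params$learning_rate
  )
  train_op <- optimizer$minimize(
    loss = loss,
    global_step = tf$train$get_global_step()
  )

  estimator_spec(
    mode = mode,
    predictions = predictions_dict,
    loss = loss,
    train_op = train_op,
    eval_metric_ops = eval_metric_ops
  )
}
```

The function can then be passed to estimator() together with a params list containing learning_rate.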