transformer_encoder (encoder-only) runs only the encoder for sequence-to-class
modeling. Example use case: sentiment classification.

The Transformer is just one of the models in the Tensor2Tensor library.
Tensor2Tensor (T2T) is a library of deep learning models and datasets as
well as a set of scripts that allow you to train the models and to download and
prepare the data.

Before you begin

Before starting this tutorial, check that your Cloud project is correctly set
up, and create a Compute Engine VM and a TPU resource.

In this tutorial, you manually set up the VM and TPU instances, but T2T can
also create those instances for you automatically with its --cloud_tpu
flag. See T2T's Cloud TPU docs for more information.

--machine-type=n1-standard-4 is a standard machine
type with 4 virtual CPUs and 15 GB of memory. See
Machine Types for more
machine types.

--image-project=ml-images is a shared
collection of images that makes the tf-1.6 image
available for your use.

--image-family=tf-1-6 is an image with the
required pip package for TensorFlow.

--scopes=cloud-platform allows the VM to access
Cloud Platform APIs.
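Taken together, these flags describe a VM-creation command along the following
lines. This is a hedged sketch: the instance name tpu-demo-vm matches the one
used later in this tutorial, and the zone is an assumption you should replace
with the zone where you plan to create your TPU.

```shell
# Sketch only: create the Compute Engine VM described by the flags above.
# us-central1-b is an assumed zone; adjust to match your TPU's zone.
gcloud compute instances create tpu-demo-vm \
  --machine-type=n1-standard-4 \
  --image-project=ml-images \
  --image-family=tf-1-6 \
  --scopes=cloud-platform \
  --zone=us-central1-b
```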

Create a new Cloud TPU resource. For this example, name the resource
demo-tpu. Keep in mind that billing begins as soon as the TPU is created
and continues until it is deleted. (Check the
Cloud TPU pricing page to
estimate your costs.) If you are using a dataset that requires a substantial
download and processing phase, hold off on running this command until you
are ready to use the TPU:
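A hedged sketch of the TPU-creation command, using the gcloud beta surface
that appears later in this tutorial; the zone, network range, and TensorFlow
version are assumptions to adjust for your project:

```shell
# Sketch only: create the demo-tpu resource. Zone, range, and version
# are assumptions; confirm current values in the Cloud TPU documentation.
gcloud beta compute tpus create demo-tpu \
  --zone=us-central1-b \
  --range=10.240.1.0/29 \
  --version=1.6
```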

Connect to your VM

Alternatively, you can SSH into your Compute Engine VM from the
Google Cloud Platform Console: Go to Compute Engine -> VM instances. Find the
tpu-demo-vm instance in the list of instances, and click SSH to connect to
it.
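If you prefer the command line over the Console, a connection command might
look like this sketch (the zone is an assumption):

```shell
# Connect to the tpu-demo-vm instance over SSH.
gcloud compute ssh tpu-demo-vm --zone=us-central1-b  # zone is an assumption
```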

Add disk space to your VM

T2T conveniently packages data generation for many common open-source datasets
in its t2t-datagen script. The script downloads the data, preprocesses it, and
makes it ready for training. To do so, it needs local disk space.
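One way to provide that disk space is to create and attach a persistent disk,
then format and mount it on the VM. The disk name, size, zone, device path,
and mount point below are all assumptions for illustration:

```shell
# Sketch: create and attach a persistent disk (names and size are placeholders).
gcloud compute disks create demo-disk --size=200GB --zone=us-central1-b
gcloud compute instances attach-disk tpu-demo-vm \
  --disk=demo-disk --zone=us-central1-b

# On the VM, format and mount the new disk (the device path may differ):
sudo mkfs.ext4 -F /dev/sdb
sudo mkdir -p /mnt/disks/mnt-dir
sudo mount /dev/sdb /mnt/disks/mnt-dir
sudo chmod a+w /mnt/disks/mnt-dir
```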

You can skip this step if you run t2t-datagen on your local machine (pip
install tensor2tensor and then see the t2t-datagen command below).

where DATA_DIR is a location on Cloud Storage, and TMP_DIR is on your
local machine. In this example, TMP_DIR is a location on the disk that you
added to your Compute Engine VM at the start of the tutorial.

Use the t2t-datagen script to generate the training and
evaluation data on the Cloud Storage bucket, so that the Cloud TPU can access
the data:
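An invocation consistent with the shard names mentioned below might look like
this sketch; the bucket name is a placeholder, and the problem name
translate_ende_wmt32k_packed is inferred from those shard names:

```shell
# Sketch: generate the WMT English-German data. Replace my-bucket-name.
DATA_DIR=gs://my-bucket-name/data    # Cloud Storage location for generated data
TMP_DIR=/mnt/disks/mnt-dir/t2t_tmp   # assumed mount point on the added disk
mkdir -p $TMP_DIR

t2t-datagen \
  --problem=translate_ende_wmt32k_packed \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR
```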

You can view the data on Cloud Storage by going to the Google Cloud Platform Console and
choosing Storage from the left-hand menu. Click the name of the bucket that
you created for this tutorial. You should see sharded files named
translate_ende_wmt32k_packed-train and translate_ende_wmt32k_packed-dev.

Give your TPU access to the data

You need to give your TPU read/write access to Cloud Storage objects.
To do that, you must grant the required access to the service account used by
the TPU. Follow these steps to find the TPU service account and grant the necessary
access:

List your TPUs to find their names:

$ gcloud beta compute tpus list

Use the describe command to find the service account of your
TPU, where demo-tpu is the name of your TPU resource:

$ gcloud beta compute tpus describe demo-tpu

Copy the name of the TPU service account from the output of the
describe command. The name has the format of an email
address, like 12345-compute@my.serviceaccount.com.
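With the service account name copied, one way to grant it read/write access is
through gsutil's IAM support, after which you can start training with
t2t-trainer. Both commands below are hedged sketches: the service account,
bucket name, and flag values are placeholders, and flag names can vary across
T2T versions.

```shell
# Sketch: grant the TPU service account read/write access to the bucket.
gsutil iam ch \
  serviceAccount:12345-compute@my.serviceaccount.com:objectAdmin \
  gs://my-bucket-name

# Sketch: train the Transformer on the TPU. Step counts and paths are
# illustrative; --train_steps controls the length of training.
t2t-trainer \
  --model=transformer \
  --hparams_set=transformer_tpu \
  --problem=translate_ende_wmt32k_packed \
  --train_steps=10 \
  --eval_steps=3 \
  --data_dir=gs://my-bucket-name/data \
  --output_dir=gs://my-bucket-name/training/transformer_ende_1 \
  --use_tpu=True \
  --cloud_tpu_name=demo-tpu
```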

The above command runs 10 training steps, then 3 evaluation steps. You can
(and should) increase the number of training steps by adjusting the
--train_steps flag. Translations usually begin to be reasonable after ~40k
steps. The model typically converges to its maximum quality after ~250k
steps.

View the output in your Cloud Storage bucket by going to the Google Cloud Platform Console
and choosing Storage from the left-hand menu. Click the name of the
bucket that you created for this tutorial. Within the bucket, navigate to
the training directory, for example, /training/transformer_ende_1, to
see the model output. You can launch TensorBoard pointing at that
directory to see training and evaluation metrics.
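For example (the bucket name and training directory are placeholders matching
the ones used above):

```shell
# Point TensorBoard at the training output directory on Cloud Storage.
tensorboard --logdir=gs://my-bucket-name/training/transformer_ende_1
```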

Train a language model

You can use the transformer model for language modeling as well. Run the
following command to generate the training data:
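A hedged sketch of such a command, assuming the languagemodel_lm1b32k problem
(substitute the language-modeling problem you intend to use) and the DATA_DIR
and TMP_DIR locations from earlier in the tutorial:

```shell
# Sketch: generate language-modeling data. languagemodel_lm1b32k is an
# assumed problem name; DATA_DIR and TMP_DIR are the same locations used
# for the translation data above.
t2t-datagen \
  --problem=languagemodel_lm1b32k \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR
```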

Select the route that Google automatically created as part of the
Cloud TPU setup. The peering entry starts with
peering-route in the ID.

At the top of the Network Routes page, click Delete to delete the
selected route.

When you've finished examining the data, use the gsutil command to
delete any Cloud Storage buckets you created during this tutorial. (See the
Cloud Storage pricing guide for free storage limits and other
pricing information.) Replace my-bucket-name with the name of your
Cloud Storage bucket:
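For example, removing a bucket and all of its contents might look like this
(my-bucket-name is the placeholder used above):

```shell
# Recursively delete the tutorial bucket and everything in it.
gsutil rm -r gs://my-bucket-name
```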