TensorFlow

TensorFlow Image Classification: Three Quick Tutorials

TensorFlow can help you build neural network models to classify images. Commonly, these will be Convolutional Neural Networks (CNNs). TensorFlow is a powerful framework that lets you define, customize and tune many types of CNN architectures. MissingLink’s deep learning platform provides an additional layer for tracking and managing TensorFlow projects.

Following is a typical process to perform TensorFlow image classification:

Scaling Up Image Classification on TensorFlow with MissingLink

If you’re working on image classification, you probably have a large dataset and need to run your experiments on several machines. This can become challenging, and you might find yourself spending serious time setting up machines, copying data and troubleshooting.

MissingLink is a deep learning platform that lets you effortlessly scale TensorFlow image classification models across many machines, either on-premise or in the cloud. It also helps you manage large datasets, manage multiple experiments, and view hyperparameters and metrics across your entire team in a single pane of glass.

Quick Tutorial #1: Image Classification with Transfer Learning

Modern image recognition models use millions of parameters. Training them from scratch demands large amounts of labeled training data and hundreds of GPU-hours or more of compute power. Transfer learning provides a shortcut: you take a piece of a model that has been trained on a similar task and reuse it in a new model.

Here, we will reuse the feature extraction abilities of an image classifier trained on ImageNet: we take its image feature extraction module and train an additional classification layer on top of it.

Prerequisites: Install tensorflow-hub and a recent version of TensorFlow.

1. Running the retrain script

This script loads the pre-trained module and trains a new classifier on top of it for the flower photos. The flower species were not in the original ImageNet classes the full network was trained on.

2. Bottlenecks

The initial phase analyzes the images on disk, calculates their bottleneck values, and caches them. ‘Bottleneck’ refers to the layer just before the final output layer. The final retraining can succeed on new classes because the kind of information needed to distinguish between all 1,000 ImageNet classes is also useful for distinguishing between new types of objects.

Every image is reused many times during training, so caching these bottleneck values on disk avoids recomputing them. By default, they are kept in the /tmp/bottleneck directory.
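The caching idea can be sketched as follows. This is a minimal illustration, not the retrain script itself: `extract_features` is a hypothetical stand-in for a forward pass through the pre-trained network up to the bottleneck layer.

```python
import os
import numpy as np

def bottleneck_for(image_path, extract_features, cache_dir="/tmp/bottleneck"):
    """Return the bottleneck vector for an image, computing it at most once.

    `extract_features` is a hypothetical callable standing in for the
    expensive forward pass through the pre-trained network.
    """
    os.makedirs(cache_dir, exist_ok=True)
    cache_file = os.path.join(cache_dir, os.path.basename(image_path) + ".npy")
    if os.path.exists(cache_file):
        return np.load(cache_file)           # cache hit: skip the forward pass
    features = extract_features(image_path)  # expensive forward pass
    np.save(cache_file, features)            # cache for the next epoch
    return features
```

Because each image is seen thousands of times during retraining, paying the forward-pass cost only once per image is what makes retraining fast.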

3. Training

Training the top layer of the network starts after the bottlenecks are complete. You will see step outputs, training accuracy, validation accuracy, and cross entropy values.

This script will run 4,000 training steps. Each step selects ten images at random from the training set, looks up their bottleneck values in the cache, and feeds them into the final layer to generate predictions. The predictions are compared to the actual labels, and the results are used to update the final layer’s weights via backpropagation (see our in-depth guide on backpropagation).
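The training loop described above can be sketched in plain NumPy, since only a single softmax layer is being trained on the cached bottleneck vectors. This is an illustrative sketch, not the retrain script; the function name and defaults are ours.

```python
import numpy as np

def train_final_layer(bottlenecks, labels, num_classes,
                      steps=4000, batch_size=10, lr=0.01, seed=0):
    """Train only a softmax classification layer on cached bottlenecks.

    `bottlenecks` is an (N, D) array of cached feature vectors and
    `labels` an (N,) array of integer class ids. Each step mimics the
    script: sample a small random batch, predict with the final layer,
    and update its weights from the cross-entropy gradient.
    """
    rng = np.random.default_rng(seed)
    n, d = bottlenecks.shape
    w = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch_size)    # random batch of images
        x, y = bottlenecks[idx], labels[idx]
        logits = x @ w + b
        logits -= logits.max(axis=1, keepdims=True)  # numerically stable softmax
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        probs[np.arange(batch_size), y] -= 1.0       # dLoss/dlogits = probs - onehot
        w -= lr * x.T @ probs / batch_size           # gradient step on weights
        b -= lr * probs.mean(axis=0)                 # gradient step on biases
    return w, b
```

Because the expensive convolutional layers are frozen, each step is just a small matrix multiply, which is why 4,000 steps complete in minutes rather than days.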

Accuracy improves as the process evolves. After all the steps are complete, a final test accuracy evaluation is conducted on a separate set of images.

4. Using the Retrained Model

The script will write the model trained on your categories to:

/tmp/output_graph.pb

And a text file with the labels to:

/tmp/output_labels.txt

The model includes the TF-Hub module inlined into it and the classification layer. The two files are in a format that the C++ and Python image classification example can read.
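Consuming those two files is straightforward: the labels file holds one class name per line, in the same order as the model's output vector. A minimal sketch (the function name is ours):

```python
import numpy as np

def top_label(scores, labels_path="/tmp/output_labels.txt"):
    """Map the retrained model's output scores back to label names.

    `scores` is the model's output probability vector; the labels file
    written by the retrain script lists one class name per line, in the
    same order as the output vector.
    """
    with open(labels_path) as f:
        labels = [line.strip() for line in f]
    best = int(np.argmax(scores))
    return labels[best], float(scores[best])
```

The same index-to-line correspondence is what the C++ and Python label_image examples rely on.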

You replaced the top layer, so you need to specify the new layer’s name in the script, for example by passing the flag --output_layer=final_result if you’re using label_image.

Here’s an example of how to run the label_image example with the retrained model. TensorFlow Hub modules accept inputs with color values in the range [0,1], so there is no need to set --input_mean or --input_std flags.

You should see flower labels listed, typically with a daisy on top. You can substitute the --image parameter with your own images.

5. Training on your own categories

Once the script works successfully on the flower example images, you can teach your network to recognize other categories.

Quick Tutorial #2: Classifying Dog Images with ResNet-50

ResNet is an ultra-deep CNN architecture that can contain up to thousands of convolutional layers. ResNet-50 is a variant with 50 layers, each processing successively smaller-scale features of the source images.

By the end of this tutorial, you will have code that accepts an input image and returns an estimate of the dog’s breed. If a human face is detected instead, the algorithm will estimate the dog breed that most resembles the face.

The following steps are summarized; see the full tutorial by Hamza Bendemra.

1. Setting up the building blocks for the algorithm

To create our algorithm, we will use TensorFlow, the OpenCV computer vision library and Keras, a front-end API for TensorFlow.

2. Detecting if an image contains a human face

To check whether the image contains a human face, we will use an OpenCV face detection algorithm. First, convert the image to grayscale. The detectMultiScale function runs the classifier stored in face_cascade, taking the grayscale image as a parameter.

import cv2                       # OpenCV computer vision library
import matplotlib.pyplot as plt  # for displaying images
%matplotlib inline               # render plots inside the notebook

The following lines of code extract a pre-trained face detector and provide the value “True” if the function identifies a face.

3. Making predictions with the pre-trained model

For the final prediction, we take the argmax of the model’s predicted probability vector to obtain an integer corresponding to the predicted object class, which we can map to an object category via the ImageNet labels dictionary.
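The argmax-plus-dictionary lookup can be sketched as below. The tiny `imagenet_labels` dictionary is a hypothetical stand-in for the full 1,000-entry mapping; the indices shown follow the common Keras ImageNet class ordering.

```python
import numpy as np

# A tiny stand-in for the full 1,000-entry ImageNet labels dictionary.
imagenet_labels = {151: "Chihuahua", 207: "golden retriever", 254: "pug"}

def predict_label(probability_vector):
    """Return (class_id, name) for the highest-probability class."""
    class_id = int(np.argmax(probability_vector))  # index of the top score
    return class_id, imagenet_labels.get(class_id, "unknown")
```

In the full tutorial, the fact that ImageNet's dog classes occupy a contiguous range of indices is what makes it easy to decide whether the top prediction is a dog at all.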

4. Build your CNN classifier with transfer learning

To minimize training time while retaining accuracy, we will train the CNN using transfer learning. By keeping the early layers frozen and training only the newly added layers, we can reuse the knowledge acquired by the pre-trained algorithm. Keras offers several pre-trained deep learning models that can be used for prediction, fine-tuning and feature extraction.

5. Model Architecture

We will create our model architecture so that the last convolutional output of ResNet-50 becomes the input to our model. Add a Global Average Pooling layer, followed by a fully connected layer that has one node for each dog category and a softmax activation function.
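A sketch of that top-layer architecture in Keras. This is not the tutorial's exact code: `num_breeds` is a placeholder for your label count, and the (7, 7, 2048) input shape assumes ResNet-50's last convolutional output for 224x224 images.

```python
import tensorflow as tf

# Placeholder for the number of dog categories in your dataset.
num_breeds = 133

# Input is ResNet-50's last convolutional feature map: (7, 7, 2048).
inputs = tf.keras.Input(shape=(7, 7, 2048))
x = tf.keras.layers.GlobalAveragePooling2D()(inputs)   # (7,7,2048) -> (2048,)
outputs = tf.keras.layers.Dense(num_breeds, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="rmsprop", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Global average pooling collapses each 7x7 feature map to a single number, so the only trainable parameters are the 2048 x num_breeds weights of the final softmax layer, which keeps training fast.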

6. Compile and test the model

Use the CNN to test how accurately it identifies breeds in our test dataset. Fine-tune the model by running 20 training iterations.

Quick Tutorial #3: Classifying Flower Images with Google Inception

This tutorial shows how to classify a database of 7,000 flower images using Google Inception. Inception is an image classifier which Google built and open-sourced. It was trained on a staggering 1.2 million images from a thousand different categories for two weeks at a time on some of the fastest machines in the world. Inception’s architecture is shown below.

The following tutorial steps are summarized; see the full tutorial by Amitabha Dey.

1. Download training images and scripts

Begin by downloading the training images for your classifier. These will consist of the images that you want your classifier to recognize. Keep them in separate labeled folders, as the folder names are used as the labels for the photos they hold.
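The folders-as-labels convention can be sketched as follows; the function name and the example folder layout in the docstring are ours.

```python
from pathlib import Path

def folder_labels(root):
    """List (image_path, label) pairs, using each folder's name as the label.

    Assumes a layout like root/daisy/img1.jpg, root/roses/img2.jpg, ...
    """
    pairs = []
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():                       # one folder per class
            for image in sorted(folder.iterdir()):
                pairs.append((image, folder.name))  # folder name is the label
    return pairs
```

This is why renaming a folder is enough to relabel every photo inside it: no separate annotation file is involved.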

For this example, download images of five kinds of flowers, more than 7,000 images in total.

Clone the project’s GitHub repository and copy the flower_photos folder containing your training images into the tf_files folder of the repository.

2. Retrain the network

Train the final layer of our network. The following directory retains the cache of all the bottleneck values:

You will get a readout of all the categories with their confidence scores. In the tutorial’s example, the readout shows that test_image is a daisy with ~99% confidence.

TensorFlow Image Classification in the Real World

In this article, we explained the basics of image classification with TensorFlow and provided three tutorials from the community, which show how to perform classification with transfer learning, ResNet-50 and Google Inception. When you start working on real-life CNN projects to classify large image datasets, you’ll run into some practical challenges:

Tracking Experiments

Tracking experiment source code, configuration, and hyperparameters. There are many CNN architectures, and you’ll need to discover which one suits your needs and fine-tune it for your specific dataset. You’ll probably run hundreds or thousands of experiments to discover the right hyperparameters. Organizing, tracking and sharing data for all those experiments is difficult.

Scaling up your experiments

Image classification models are computationally intensive, and you’ll need to scale experiments across multiple machines and GPUs. Provisioning those machines, whether you have to install on-premise machines or set up machine instances in the cloud, and ensuring the right experiments run on each machine, takes serious time.

Managing training data

Image and video classification projects typically involve large and sometimes huge datasets. Copying these datasets to each training machine, then re-copying them when you change projects or fine-tune the training examples, is time-consuming and error-prone.

MissingLink is a deep learning platform that does all of this for you, and lets you concentrate on building the most accurate model. Learn more to see how easy it is.