In our previous tutorial, we learned how to use models trained for image classification on the ILSVRC data. In this tutorial, we will discuss how to use one of those models as a feature extractor and train a new model for a different classification task.

Suppose you want to build a household robot that can cook food. The first step would be to identify different vegetables. In this tutorial, we will build a model that identifies Tomato, Watermelon, and Pumpkin. In the previous tutorial, we saw that the pre-trained models were not able to identify these categories because the models were never trained on them.

Transfer Learning vs Fine-tuning

The pre-trained models are trained on very large-scale image classification problems. The convolutional layers act as a feature extractor and the fully connected layers act as a classifier.

Since these models are very large and have seen a huge number of images, they tend to learn very good, discriminative features. We can either use the convolutional layers merely as a feature extractor or we can tweak the already trained convolutional layers to suit our problem at hand. The former approach is known as Transfer Learning and the latter as Fine-tuning.

As a rule of thumb, when we have a small training set and our problem is similar to the task the pre-trained models were trained on, we can use transfer learning. If we have enough data, we can also tweak the convolutional layers so that they learn more robust features relevant to our problem. You can get a detailed overview of fine-tuning and transfer learning here. We will discuss Transfer Learning in Keras in this post.

ImageNet Jargon

ImageNet is based upon WordNet, which groups words into sets of synonyms (synsets). Each synset is assigned a “wnid” (WordNet ID). Note that a general category can contain many subcategories, and each of them belongs to a different synset. For example, Working Dog (synset = n02103406), Guide Dog (synset = n02109150), and Police Dog (synset = n02106854) are three different synsets.

The wnids of the three object classes we are considering are given below:

n07734017 -> Tomato

n07735510 -> Pumpkin

n07756951 -> Watermelon

Download and prepare Data

For downloading ImageNet images by wnid, there is a nice code repository written by Tzuta Lin which is available on GitHub. You can use it to download images for a specific “wnid” — visit the GitHub page and follow the instructions there.

However, if you are just starting out and do not want to download full-size images, you can use another Python library available through pip – imagenetscraper. It is easy to use and also provides resizing options. Installation and usage instructions are provided below. Note that it works with Python 3 only.
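A minimal sketch of the install and download commands — the output-directory argument shown here follows the project's README at the time of writing, so run `imagenetscraper --help` to confirm the options for your version:

```shell
# imagenetscraper works with Python 3 only
pip3 install imagenetscraper

# One call per wnid; the second argument is the output directory.
imagenetscraper n07734017 tomato
imagenetscraper n07735510 pumpkin
imagenetscraper n07756951 watermelon
```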

I found that the data is very noisy — there is a lot of clutter, objects are occluded, etc. So, I shortlisted around 250 images for each class. We need to create two directories, “train” and “validation”, so that we can use the Keras functions for loading images in batches.
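A small helper for creating that layout is sketched below. The `split_dataset` function and the directory names are illustrative (only the 80:20 split and the train/validation structure come from this post); adjust the paths to your system.

```python
import os
import random
import shutil

def split_dataset(src_dir, dest_dir, classes, train_fraction=0.8, seed=0):
    """Copy images from src_dir/<class> into dest_dir/train/<class> and
    dest_dir/validation/<class>, using a shuffled 80:20 split."""
    random.seed(seed)
    for cls in classes:
        files = sorted(os.listdir(os.path.join(src_dir, cls)))
        random.shuffle(files)
        n_train = int(len(files) * train_fraction)
        for subset, names in (("train", files[:n_train]),
                              ("validation", files[n_train:])):
            out = os.path.join(dest_dir, subset, cls)
            os.makedirs(out, exist_ok=True)
            for name in names:
                shutil.copy(os.path.join(src_dir, cls, name), out)
```

Calling `split_dataset("raw_images", "data", ["tomato", "pumpkin", "watermelon"])` would produce the `data/train` and `data/validation` folders used in the rest of the post.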

Load the pre-trained model
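A minimal version of this step, assuming the Keras bundled with TensorFlow (the import path may differ slightly for standalone Keras):

```python
from tensorflow.keras.applications.vgg16 import VGG16

# Load the VGG16 convolutional base with ImageNet weights.
# include_top=False drops the fully connected classifier layers.
model = VGG16(weights="imagenet", include_top=False,
              input_shape=(224, 224, 3))
model.summary()
```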

In the above code, we load the VGG model along with the ImageNet weights, similar to our previous tutorial. There is, however, one change – include_top=False. We have not loaded the fully connected layers at the top, which act as the classifier; we are loading only the convolutional layers. It should be noted that the output of the last convolutional block has a shape of 7 x 7 x 512 for a 224 x 224 input.

Extract Features

The data is split in an 80:20 ratio and kept in separate train and validation folders. Each folder should contain three subfolders, one for each class. You can change the directories according to your system.

Then we use the model.predict() function to pass each image through the network, which gives us a tensor of shape 7 x 7 x 512. We reshape that tensor into a 25088-dimensional vector. Similarly, we compute the validation features.
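The predict-and-flatten step can be sketched as follows; the `extract_features` helper is an assumption (the original code iterates over the generators, but the reshape is the same):

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16

def extract_features(conv_base, images):
    """images: float array of shape (N, 224, 224, 3), scaled to [0, 1].
    Returns an (N, 25088) array of flattened VGG16 conv features."""
    features = conv_base.predict(images)        # (N, 7, 7, 512)
    return features.reshape(len(images), -1)    # (N, 7 * 7 * 512)

# e.g., with the generators from the previous step:
#   conv_base = VGG16(weights="imagenet", include_top=False,
#                     input_shape=(224, 224, 3))
#   x_batch, y_batch = next(train_generator)
#   train_features = extract_features(conv_base, x_batch)
```

These flattened vectors (and the matching validation_features) become the input to the new classifier.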
