To do this, we will build a Cat/Dog image classifier using a deep learning algorithm called convolutional neural network (CNN) and a Kaggle dataset.

The first part covers some core concepts behind deep learning, while the second part is structured in a hands-on tutorial format.

In the first part of the hands-on tutorial (section 4), we will build a Cat/Dog image classifier using a convolutional neural network from scratch.

In the second part of the tutorial (section 5), we will cover an advanced technique for training convolutional neural networks called transfer learning.

By the end of this post, you will understand how convolutional neural networks work, and you will get familiar with the steps and the code for building these networks.

Our goal is to build a machine learning algorithm capable of detecting the correct animal (cat or dog) in new unseen images.

Classification using a machine learning algorithm has 2 phases: the training phase and the prediction phase. The training phase for an image classification problem has 2 main steps: extracting features from the images, and training the machine learning algorithm on those features and their labels. In the prediction phase, we apply the same feature extraction process to the new images and pass the features to the trained machine learning algorithm to predict the label.
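The two phases can be sketched in a few lines. This is a toy illustration, not the tutorial's code: the "feature" is just the mean intensity per channel, and the "algorithm" is a nearest-centroid classifier, both chosen only to make the train/predict split concrete.

```python
import numpy as np

# Hypothetical feature extractor: mean intensity per color channel.
def extract_features(image):
    return image.reshape(-1, image.shape[-1]).mean(axis=0)

# Training phase: extract features, then fit a classifier (here, one
# centroid per class).
def train(images, labels):
    feats = np.array([extract_features(img) for img in images])
    labels = np.array(labels)
    return {c: feats[labels == c].mean(axis=0) for c in np.unique(labels)}

# Prediction phase: apply the SAME feature extraction to a new image,
# then hand the features to the trained model.
def predict(model, image):
    f = extract_features(image)
    return min(model, key=lambda c: np.linalg.norm(f - model[c]))
```

The point is the symmetry: whatever feature extraction runs at training time must run identically at prediction time.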

The promise of deep learning is more accurate machine learning algorithms compared to traditional machine learning with less or no feature engineering.

In addition to algorithmic innovations, the increase in computing capabilities using GPUs and the collection of larger datasets are all factors that helped in the recent surge of deep learning.

The basic model for how neurons work goes as follows: each synapse has a strength that is learnable and controls the strength of influence of one neuron on another.

If the final sum is above a certain threshold, the neuron fires, sending a spike along its axon.[1] Artificial neurons are inspired by biological neurons and try to formulate this model in a computational form.

An artificial neuron has a finite number of inputs with weights associated to them, and an activation function (also called transfer function).
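The artificial neuron described above can be written down in a few lines. This is a minimal sketch of the model, not any particular library's implementation:

```python
import math

def neuron(inputs, weights, bias,
           activation=lambda s: 1.0 / (1.0 + math.exp(-s))):
    # Weighted sum of the inputs; the weights play the role of the
    # learnable synapse strengths.
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The activation (transfer) function determines the neuron's output.
    return activation(s)

# A hard-threshold activation mirrors the biological "fires if the sum
# is above a threshold" behavior:
step = lambda s: 1.0 if s > 0 else 0.0
```

With the sigmoid default the output varies smoothly between 0 and 1, which is what makes the weights trainable by gradient descent.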

We need 2 elements to train an artificial neural network. Once we have them, we train the ANN using an algorithm called backpropagation together with gradient descent (or one of its variants).
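The idea behind the gradient descent update is easy to show on a single weight. In a real network, backpropagation computes this same gradient for every weight; the sketch below (a made-up one-parameter squared-error problem) only illustrates the update rule itself:

```python
# Minimal gradient descent on one weight w, minimizing (w*x - y)^2.
def train_weight(x, y, lr=0.1, steps=100):
    w = 0.0
    for _ in range(steps):
        grad = 2 * (w * x - y) * x   # derivative of the squared error w.r.t. w
        w -= lr * grad               # gradient descent update
    return w
```

Repeated small steps against the gradient drive the error toward a minimum, here w converges to y/x.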

CNNs have special layers called convolutional layers and pooling layers that allow the network to encode certain image properties.

This layer consists of a set of learnable filters that we slide over the image spatially, computing dot products between the entries of the filter and the input image.

For example, if we want to apply a filter of size 5x5 to a colored image of size 32x32, then the filter should have depth 3 (5x5x3) to cover all 3 color channels (Red, Green, Blue) of the image.
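The sliding dot product can be spelled out directly. This is an illustrative single-filter sketch (no padding, stride 1), not an optimized implementation:

```python
import numpy as np

def convolve(image, filt):
    # Slide the filter spatially over the image; at each position take the
    # dot product between the filter entries and the image patch under it.
    H, W, _ = image.shape
    fh, fw, _ = filt.shape
    out = np.zeros((H - fh + 1, W - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+fh, j:j+fw, :] * filt)
    return out
```

For the example above, a 5x5x3 filter over a 32x32x3 image produces a 28x28 activation map, and a convolutional layer learns a whole set of such filters.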

The goal of the pooling layer is to progressively reduce the spatial size of the representation to reduce the amount of parameters and computation in the network, and hence to also control overfitting.

A pooling layer of size 2x2 with a stride of 2 shrinks the input image to 1/4 of its original size.
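The shrinking is easy to see in code. A minimal max-pooling sketch for a single channel:

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    # 2x2 max pooling with stride 2 keeps only the largest value in each
    # window, halving both spatial dimensions (1/4 of the original area).
    H, W = x.shape
    out = np.zeros((H // stride, W // stride))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i*stride:i*stride+size,
                          j*stride:j*stride+size].max()
    return out
```

A 4x4 input becomes 2x2: four values, each the maximum of one 2x2 window.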

[2] The simplest architecture of a convolutional neural network starts with an input layer (images) followed by a sequence of convolutional layers and pooling layers, and ends with fully-connected layers.

The convolutional, pooling and ReLU layers act as learnable feature extractors, while the fully-connected layers act as a machine learning classifier.
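As an illustration only (this is not one of the tutorial's actual configuration files, and the layer names and parameters are made up), such a conv → ReLU → pool → fully-connected sequence could be expressed in Caffe's prototxt format as:

```protobuf
layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1"
        convolution_param { num_output: 32 kernel_size: 5 } }
layer { name: "relu1" type: "ReLU" bottom: "conv1" top: "conv1" }
layer { name: "pool1" type: "Pooling" bottom: "conv1" top: "pool1"
        pooling_param { pool: MAX kernel_size: 2 stride: 2 } }
layer { name: "fc1" type: "InnerProduct" bottom: "pool1" top: "fc1"
        inner_product_param { num_output: 2 } }
layer { name: "prob" type: "Softmax" bottom: "fc1" top: "prob" }
```

Everything up to "fc1" extracts features; the InnerProduct and Softmax layers at the end do the classification.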

Furthermore, the early layers of the network encode generic patterns of the images, while later layers encode the detailed patterns of the images.

After setting up an AWS instance, we connect to it and clone the GitHub repository that contains the necessary Python code and Caffe configuration files for the tutorial.

There are 4 steps in training a CNN using Caffe: data preparation, model definition, solver definition, and model training. After the training phase, we will use the trained .caffemodel to make predictions on new, unseen data.

transform_img takes a color image as input, performs histogram equalization on each of the 3 color channels, and resizes the image.
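The two operations can be sketched without any image library. Note this is an assumption-laden illustration, not the tutorial's transform_img (which relies on OpenCV); it shows per-channel histogram equalization and a crude nearest-neighbor resize:

```python
import numpy as np

def equalize_channel(ch):
    # Histogram equalization: remap intensities so that their cumulative
    # distribution becomes roughly uniform over 0..255.
    hist, _ = np.histogram(ch.flatten(), bins=256, range=(0, 256))
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) * 255.0 / max(cdf.max() - cdf.min(), 1)
    return cdf[ch.astype(np.uint8)].astype(np.uint8)

def transform_img(img, width, height):
    # Equalize each of the 3 color channels independently, then resize
    # with nearest-neighbor sampling (the tutorial resizes with OpenCV).
    eq = np.stack([equalize_channel(img[:, :, c]) for c in range(3)], axis=2)
    rows = np.arange(height) * img.shape[0] // height
    cols = np.arange(width) * img.shape[1] // width
    return eq[rows][:, cols]
```

Applying the same transformation to every image puts the training data on a comparable intensity scale and a fixed input size.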

We need to make the modifications below to the original bvlc_reference_caffenet prototxt file. We can print the model architecture by executing the command below.

The optimization process will run for a maximum of 40000 iterations and will take a snapshot of the trained model every 5000 iterations.
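A solver definition consistent with these settings might look like the sketch below. Only max_iter and snapshot come from the text above; the net path, learning-rate values, and snapshot prefix are illustrative placeholders, not the tutorial's actual file:

```protobuf
net: "caffe_models/caffe_model_1/caffenet_train_val_1.prototxt"  # illustrative path
base_lr: 0.001        # illustrative learning rate
lr_policy: "step"
stepsize: 10000       # illustrative
max_iter: 40000       # run for a maximum of 40000 iterations
snapshot: 5000        # snapshot the trained model every 5000 iterations
snapshot_prefix: "caffe_models/caffe_model_1/caffe_model_1"
solver_mode: GPU
```

The solver file is what ties the training schedule (iterations, learning rate, snapshots) to the network definition.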

After defining the model and the solver, we can start training the model by executing the command below. The training logs will be stored under deeplearning-cats-dogs-tutorial/caffe_models/caffe_model_1/model_1_train.log.

The code above stores the mean image under mean_array, defines a model called net by reading the deploy file and the trained model, and defines the transformations that we need to apply to the test images.

The code above reads an image, applies the same image processing steps used in the training phase, calculates each class's probability, and prints the class with the largest probability (0 for cats, and 1 for dogs).
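The final step of that prediction, turning the network's raw outputs into a class label, works as follows. This standalone numpy sketch is not the Caffe call itself, just the softmax-then-argmax logic:

```python
import numpy as np

def predict_label(scores):
    # Softmax turns the network's raw output scores into probabilities
    # that sum to 1 (shifting by the max keeps the exponentials stable).
    e = np.exp(scores - np.max(scores))
    probs = e / e.sum()
    # The class with the largest probability is the prediction
    # (0 for cats, 1 for dogs).
    return int(np.argmax(probs)), probs
```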

Instead of training the network from scratch, transfer learning utilizes a trained model on a different dataset, and adapts it to the problem that we're trying to solve.
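In Caffe, one common way to adapt a pre-trained model is to freeze the early layers, whose generic features transfer well, by setting their learning-rate multipliers to zero, so that only the later layers are fine-tuned on the new dataset. A hypothetical fragment (layer name and structure illustrative):

```protobuf
# Freeze a pre-trained convolutional layer: lr_mult of 0 means its
# weights and biases are not updated during fine-tuning.
layer {
  name: "conv1"
  type: "Convolution"
  param { lr_mult: 0 }   # weights
  param { lr_mult: 0 }   # biases
  convolution_param { num_output: 96 kernel_size: 11 stride: 4 }
}
```

The layers that should learn the new task keep non-zero multipliers, and the final classification layer is typically renamed so Caffe reinitializes it for the new set of classes.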

Conventional tokenizers are not reversible: for example, Tokenize(“World.”) == Tokenize(“World .”), since “.” is split off as its own token and the whitespace information is dropped from the tokenized sequence. SentencePiece, in contrast, treats the input text just as a sequence of Unicode characters.

To handle the whitespace as a basic token explicitly, SentencePiece first escapes the whitespace with a meta symbol '▁' (U+2581) as follows.

Then, this text is segmented into small pieces, for example: [Hello] [▁Wor] [ld] [.] Since the whitespace is preserved in the segmented text, we can detokenize the text without any ambiguities.
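The escape and detokenize logic amounts to a pair of string substitutions. A minimal sketch of the idea in plain Python (not the SentencePiece implementation itself):

```python
META = "\u2581"  # '▁', the meta symbol SentencePiece uses for whitespace

def escape(text):
    # Replace each space with the meta symbol so whitespace becomes an
    # ordinary character that can appear inside tokens.
    return text.replace(" ", META)

def detokenize(pieces):
    # Because whitespace was preserved as '▁' in the pieces, concatenating
    # them and unescaping restores the original text unambiguously.
    return "".join(pieces).replace(META, " ")
```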

For more detail, see the Python module documentation.

The following tools and libraries are required to build SentencePiece. On Ubuntu, autotools and the protobuf library can be installed with apt-get (if libprotobuf9v5 is not found, try libprotobuf-c++ instead). On OSX, you can use brew. If you want to use a self-prepared protobuf library, set up the environment variables below before building.

Note that spm_train loads only the first --input_sentence_size sentences (the default value is 10M).