Handwritten Digit Prediction using Convolutional Neural Networks in TensorFlow with Keras and Live Example using TensorFlow.js

Whenever we start learning a new programming language, we start with a Hello World program. Likewise, most AI/ML developers say, “Just like programming has Hello World, machine learning has MNIST”.

Like everyone, I wanted to start there. In fact, I wanted to write my first ML article on MNIST, but that didn’t sound exciting because the internet already has loads of MNIST articles. I wanted my article to be different from the others, so I thought: along with the code, why not share a live example too?

Let’s get started. I hope you have TensorFlow and Keras installed on your system; if not, please read my previous article, which has instructions on how to install them.

The input a CNN expects is a 4D array (batch, height, width, channels). The channels dimension signifies whether the image is grayscale or color. We are using grayscale images, so we use 1 for channels; for color (RGB) images we would use 3. So we reshape our inputs to add that channel dimension.
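The reshaping step could look roughly like the sketch below. It uses a small random array as a stand-in for the MNIST images so it runs without downloading anything. The names `X_train` and `X_test` follow the normalization snippet in this article; the 28×28 image size and the sample counts (5 and 2) are assumptions for illustration.

```python
import numpy as np

# Stand-in data: grayscale image arrays of shape (num_images, 28, 28).
# Random pixels are used here only so the sketch runs on its own.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
X_test = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)

# Add a trailing channel axis: (batch, height, width) -> (batch, height, width, 1).
# Casting to float32 prepares the pixels for scaling to fractional values.
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype("float32")
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype("float32")

print(X_train.shape)  # (5, 28, 28, 1)
```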

It’s always good to normalize the data. Each pixel in our dataset holds a value between 0 and 255, so we scale it to the range 0–1 using the code below.

# normalize inputs from 0-255 to 0-1
# (the arrays must already be float, e.g. float32, for in-place division to work)
X_train /= 255
X_test /= 255

Our output ranges from 0 to 9, so it’s a multi-class classification problem. The digits are class labels with no ordering between them as far as the model is concerned, so it’s better to use one-hot encoding. One-hot encoding transforms each integer label into a binary vector that contains a single ‘1’ and ‘0’ everywhere else.

For example, if the expected output is 8, its one-hot encoding is [0,0,0,0,0,0,0,0,1,0].
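To make the encoding concrete, here is a minimal NumPy sketch. The helper name `one_hot` and the sample labels are mine, for illustration only; Keras ships a `to_categorical` utility that does the same job, and that is likely what you would use in practice.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Set the column that matches each label to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

y_example = one_hot([8, 0, 3])
print(y_example[0])  # [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
```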

The first hidden layer is a convolutional layer called Convolution2D. It has 32 filters (output channels), each of size 5×5, and an activation function. Because it is the first layer of the model, it also declares the input shape, expecting images with the structure outlined above (height, width, channels).

The second layer is a MaxPooling layer. It down-samples its input by keeping only the maximum value in each window, which makes the features less sensitive to small shifts and helps reduce overfitting. It also shrinks the feature maps, so later layers have fewer parameters to learn and training is faster.

Next comes one more convolutional layer with 32 filters (output channels) of size 3×3 and an activation function.

One more MaxPooling layer.

The next layer is a regularization layer called Dropout. It is configured to randomly exclude 20% of the neurons in the layer during training in order to reduce overfitting.

The next layer, called Flatten, converts the 2D feature maps into a vector so the output can be processed by standard fully connected layers.

The next layer is a fully connected layer with 128 neurons.

The last layer is the output layer, with 10 neurons (one per output class) and a softmax activation function. Each neuron gives the probability of its class. Softmax is used because this is multi-class classification; for binary classification we would use a sigmoid activation function.
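To see how the layers above fit together, here is a small plain-Python walk-through of the shapes and parameter counts. It is a sketch under assumptions that the text does not state: 28×28 grayscale inputs, convolutions with stride 1 and no padding, and 2×2 pooling windows. If the real model uses different settings, the numbers will differ.

```python
def conv_out(size, kernel):
    """Output side length of a convolution with stride 1 and no padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output side length of non-overlapping max pooling."""
    return size // pool

side = 28                       # input: 28 x 28 x 1 (assumed)
side = conv_out(side, 5)        # conv 5x5, 32 filters -> 24 x 24 x 32
conv1_params = 5 * 5 * 1 * 32 + 32        # weights + one bias per filter
side = pool_out(side)           # max pool -> 12 x 12 x 32
side = conv_out(side, 3)        # conv 3x3, 32 filters -> 10 x 10 x 32
conv2_params = 3 * 3 * 32 * 32 + 32
side = pool_out(side)           # max pool -> 5 x 5 x 32
flat = side * side * 32         # dropout keeps the shape; flatten -> 800 values
dense_params = flat * 128 + 128           # fully connected layer, 128 neurons
output_params = 128 * 10 + 10             # output layer, 10 neurons

total_params = conv1_params + conv2_params + dense_params + output_params
print(flat, total_params)  # 800 113898
```

Pooling, dropout and flatten add no trainable parameters, which is why only four layers show up in the total.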

Let’s compile the model. I used categorical_crossentropy as the loss function because it’s a multi-class classification problem. I used Adam as the optimizer, which updates the weights to minimize that loss. I used accuracy as the metric; it does not affect training, it only reports how well the network is doing.
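If you are curious what softmax and categorical cross-entropy actually compute, the NumPy sketch below works through a single sample. It illustrates the math only and is not Keras's internal implementation; the raw scores in `logits` are made-up numbers.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Loss for one sample: -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-np.sum(y_true * np.log(y_pred)))

# Hypothetical raw scores for the 10 digit classes; class 8 scores highest.
logits = np.array([0.1, 0.2, 0.0, 0.3, 0.1, 0.0, 0.2, 0.1, 4.0, 0.5])
y_true = np.zeros(10)
y_true[8] = 1.0  # one-hot label for the digit 8

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(int(np.argmax(probs)))  # 8
```

Because the label is one-hot, the loss reduces to the negative log of the probability assigned to the correct class, so it shrinks toward 0 as the model becomes more confident in the right digit.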

It’s time to train the model. It is fit over 10 epochs with a batch size of 200, so the weights are updated after every 200 training images. The test data is used as the validation dataset, allowing you to see the skill of the model as it trains.
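As a quick sanity check on those numbers, the arithmetic below counts how many weight updates this training run performs. It assumes the standard MNIST training split of 60,000 images, which is what the Keras loader provides.

```python
import math

num_train_images = 60_000  # size of the standard MNIST training split
batch_size = 200
epochs = 10

# One weight update per batch; the last batch may be smaller if the
# dataset size is not a multiple of the batch size.
updates_per_epoch = math.ceil(num_train_images / batch_size)
total_updates = updates_per_epoch * epochs
print(updates_per_epoch, total_updates)  # 300 3000
```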

I got around 99.19% accuracy. You will find this example code, named mnistCNN.py, in my GitHub repository.

After completing this I wasn’t satisfied, because the model had only run on the data provided by Keras. I wanted to verify my trained model on my own data, so I created a couple of images myself, stored them in my data folder, and checked them with my model. The results looked decent.
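Here is a minimal sketch of how a hand-made image could be prepared for the model with Pillow and NumPy. It is not the exact script from my repository: the function name `prepare_digit`, the `invert` flag and the blank test canvas are illustrative assumptions, and the inversion assumes a dark digit drawn on a light background.

```python
import numpy as np
from PIL import Image

def prepare_digit(img, invert=True):
    """Convert a PIL image into a (1, 28, 28, 1) float32 array scaled to 0-1."""
    img = img.convert("L").resize((28, 28))    # grayscale, MNIST-sized
    pixels = np.asarray(img, dtype="float32") / 255.0
    if invert:
        # MNIST digits are white on a black background; a digit drawn in
        # dark ink on white paper needs its intensities flipped.
        pixels = 1.0 - pixels
    return pixels.reshape(1, 28, 28, 1)

# Stand-in for an image loaded with Image.open(...): a blank white canvas.
blank = Image.new("RGB", (56, 56), "white")
batch = prepare_digit(blank)
print(batch.shape)  # (1, 28, 28, 1)
```

The returned array has the same layout as the training data, so it could be passed to the trained model's `predict` method, with an `argmax` over the 10 outputs giving the predicted digit.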

The code, images, and model file for this are in my GitHub repository. To run the code you need the Pillow package, which you can install with the command below.

pip3 install pillow

But I still wasn’t satisfied, so I thought, let’s do something more. We all know Google introduced TensorFlow.js, and I read that we can use an existing model with it. So I thought, why not build a small page for this example? From here the journey became more exciting.

First, we need a canvas where the user can draw a number. For this, I wrote an HTML page with the help of this article.

Now we want to use our model in the browser, so we need to convert it into a format that TensorFlow.js can consume. This article helped me with that task. To convert a Keras model into a TensorFlow.js-consumable model we need tensorflowjs_converter, which comes with the tensorflowjs package, so we need to install that package.

The converter creates a model file and a couple of supporting files in the models folder, named model.json, group1-shard1of1, group2-shard1of1, group3-shard1of1, and group4-shard1of1. These files let us use our trained deep learning (DL) model in the browser.

Now I am going to reveal the secret ingredient of this story.

I am going to explain 3 important things here; the rest is fairly straightforward. It all starts with including the TensorFlow.js script: add the line below to your HTML file.