Constructing a Conceptual Artificial Neural Network (continued)

In the previous article, we looked at how to construct a very simple artificial neural network to model our chicken-or-beef meal decision. However, this is overly simplistic for most 'real-world' problems. In this final article, we'll look at how to extend this basic network, and how to deal with that ever-present bugbear in machine learning: over-fitting.

Adding Hidden Layers

The multilayer perceptron (MLP) model we constructed in the previous article was very simple, with a small number of layers. Each layer applies a certain function to a set of features and can be thought of as a very abstract filter. Each successive layer in a feed-forward network works on representations of the features at a higher and higher level of abstraction. This ability to build increasingly abstract representations of the features is the key to the power and results seen with machine learning models based on ANNs.

Once we add layers between the input and the output, those intermediate layers no longer have access to the outside world, and are hence 'hidden'. We can add a hidden layer by simply repeating the layer-definition step from the initial model:
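As a minimal sketch of what that looks like in Keras (the layer widths, activation functions and four-feature input shape below are illustrative assumptions, not the exact values from the earlier articles), the extra Dense layer simply sits between the original first layer and the output layer:

from tensorflow import keras

# A small feed-forward (MLP) sketch; the sizes here are illustrative only.
model = keras.Sequential([
    keras.layers.Dense(8, activation='relu', input_shape=(4,)),  # original first layer
    keras.layers.Dense(8, activation='relu'),                    # the new hidden layer
    keras.layers.Dense(1, activation='sigmoid')                  # output: chicken (0) or beef (1)
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])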

Adding a layer will often decrease the validation (test) loss considerably and improve the accuracy. Adding this layer also increased the network's depth. As the number of hidden layers increases, the network becomes deeper; this is what is referred to as Deep Learning.

Network Dropout

Over-fitting is an ever-present issue with machine learning models. One means of reducing over-fitting is to apply dropout. This involves randomly ignoring (dropping) a fraction of a layer's units during each training pass, so the network cannot rely too heavily on any single unit. This is simply done in Keras by setting the rate parameter of a Dropout layer:

keras.layers.Dropout(0.2) # Induces a 20% drop-out rate
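As a rough sketch of where such a layer sits in a model (continuing the illustrative layer sizes assumed above, which are not the original article's exact values), a Dropout layer is placed directly after the layer whose outputs we want to randomly drop during training:

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    keras.layers.Dropout(0.2),   # drops 20% of the previous layer's outputs while training
    keras.layers.Dense(8, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(1, activation='sigmoid')
])

Note that dropout is only active during training; when evaluating or predicting, Keras uses all of the units.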

Appropriate Applications of Artificial Neural Networks

Because ANNs are 'universal approximators' built from a large number of small, simple units, they are great where:

The relationships between variables are poorly understood or analytically complex

There is a lot of data

They are not so great because:

Principal features are not made explicitly apparent, and the decisions of deep networks are opaque

They can be slow to train, requiring many epochs

Conclusion

This was a very brief introduction to the field of artificial neural networks (ANNs) and Deep Learning!

We have examined the theoretical justification for ANNs, demonstrating that they are great 'universal approximators'. We also covered their use-cases and some of their pitfalls.

We also had a brief introduction to TensorFlow and Keras. We built a feed-forward network (a Multi-Layer Perceptron; MLP) to approximate complex functions. We 'tweaked' this model to improve its output and evaluated its performance. Along the way, we covered the concepts of appropriate activation functions, optimizers and cost functions.

Perhaps most importantly: we have figured out how to be satisfied with a simple meal at an altitude of 30,000 feet. People have been trying to achieve this for years!

Additional Resources

There are plenty of great resources available to learn about artificial neural networks, from the deeply theoretical to the immensely practical. Here is a brief selection of suggested resources to learn more about this popular and useful topic.