Applied Machine Learning: The Less Confusing Guide

For the past two years or so I’ve had an on-again, off-again fascination with machine learning. As with most things you learn, I’ve had to constantly re-learn certain concepts and lessons every time I’ve decided to try something new. So, I’ve compiled a definitive collection of key concepts, definitions, resources, and tools that I find useful when navigating this complex field. Hopefully, this will be of use to others getting into the wonderful (and sometimes frustrating) world of machine learning.

How best to learn machine learning

Honestly, to each their own. What works for me has always been repeatedly applying what I learn in projects until I get good at it. Why learn it if you aren’t going to use it? Plus, the dopamine hits from seeing the fruits of your learning tend to keep you motivated.

It’s easy to get overwhelmed trying to learn machine learning. There’s a lot to learn, and too many resources teach the same content in different and sometimes confusing ways. It’s also tough to learn everything about such a vast and rapidly evolving topic. Ideally, once you feel you’ve got a sound introduction to machine learning, figure out what specific area(s) you want to specialize in and do your own research.

I’m going to try to organize this sparse collection of knowledge with enough context and information for it to make sense to anyone reading. When in doubt, Google it.

What is machine learning (ML)?

It’s a sub-area of artificial intelligence that allows computers to learn from data without having to be explicitly programmed. ML essentially aims to understand patterns in large sets of input data and then predict outputs based on the models it generates.

The machine learning workflow

What is a machine learning algorithm? Essentially, machine learning employs algorithms that can learn from data and make predictions on it. These are typically borrowed from statistics and range from simple regression algorithms to decision trees and more.

What is a machine learning model? Generally, it refers to the model artifact created after training an ML algorithm. Once you have a properly trained ML model you can use it to make predictions for new inputs. The goal of machine learning is to properly train ML algorithms to create such models. When I say ‘models’ in this post I’m always referring to this definition.

But there really isn’t a single consistent definition of the term ‘model’ within the ML community. The term gets thrown around a lot and can refer to anything from statistical models, to the data models used in ML (columns, data types and sources), to the specifications of neural nets. Be wary of this when you read up on ML across technical and mathematical guides.

Popular ML algorithms

There are a lot, and each one has its own set of appropriate use cases. You can classify ML algorithms based on learning style or similarity. The diagram below does a great job of summarising the popular ones by similarity. For the purposes of this post, I’ll group them based on learning style: supervised and unsupervised learning.

Supervised learning

This is where the machine learning algorithm is trained using example scenarios. The training data comes tagged with known labels, which the algorithm uses to build a model. Once the model is sufficiently trained, it can predict the labels for unseen instances.

Problems solved with supervised learning can be further broken down into classification and regression problems.
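
To make the idea of labeled training data concrete, here is a minimal sketch of a classification problem using a one-nearest-neighbour rule in plain Python. The data set, the feature meanings and the `predict` helper are all made up for illustration; a real project would more likely reach for an ML library than hand-roll this.

```python
import math

# Hypothetical labeled training data: (height_cm, weight_kg) -> label.
# The values and labels are invented purely for illustration.
training_data = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
]


def predict(features, examples):
    """Return the label of the training example closest to `features`."""
    nearest = min(examples, key=lambda ex: math.dist(ex[0], features))
    return nearest[1]


# Unseen instances: the labels are inferred from the labeled examples.
print(predict((152.0, 52.0), training_data))
print(predict((182.0, 88.0), training_data))
```

Because the output here is a category, this is a classification problem. If the labels were numbers on a continuous scale (a price, say), it would be a regression problem instead.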

Unsupervised learning

In contrast to supervised learning, unsupervised learning uses training data that is not labeled. This essentially means the algorithm has to make sense of the data (recognize patterns in it) on its own.

Unsupervised learning can be grouped into clustering and association problems.
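
As a rough illustration of clustering, the sketch below runs a bare-bones k-means on a handful of one-dimensional values. Note that no labels are supplied anywhere. The function name `k_means_1d`, the sample values and the starting centroids are my own assumptions, chosen only to keep the example small.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids (plain k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabeled data: the algorithm has to find the two groups by itself.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, clusters = k_means_1d(values, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```

The two clusters that come out correspond to the "low" and "high" values, even though nothing in the input said such groups exist.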

Semi-supervised learning

This is a mix of the two previous approaches – only some of the input data is labeled.

Linear Regression for Supervised Learning

This is essentially the “Hello World” tutorial for machine learning. Linear regression is used to understand the relationship between input (x) and output (y) variables. When there is only one input variable (x), it’s called simple linear regression. You’ve probably seen this technique used in simple statistics.

The most common technique used to train a linear regression equation is called Ordinary Least Squares. So, when we use this process to train a model in machine learning it’s usually referred to as Ordinary Least Squares Linear Regression.

A simple regression model for input (x) and output (y) can be modeled as such:

y = B0 + B1*x

The coefficient B1 (beta) is an estimate of the regression slope, and the additional coefficient B0 estimates the regression intercept giving the line an additional degree of freedom.
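
For the simple (one input) case, Ordinary Least Squares has a closed-form solution: B1 is the covariance of x and y divided by the variance of x, and B0 is whatever intercept makes the line pass through the point of means. The sketch below implements that in plain Python; the function name `fit_simple_ols` and the sample data are assumptions made for illustration.

```python
def fit_simple_ols(xs, ys):
    """Fit y = B0 + B1*x by ordinary least squares and return (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations (proportional to the covariance of x and y).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations of x (proportional to the variance of x).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx             # slope
    b0 = mean_y - b1 * mean_x  # intercept
    return b0, b1


# Made-up data that lies exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_ols(xs, ys)

# The trained "model" is just the pair (b0, b1); use it to predict a new input.
prediction = b0 + b1 * 5.0
print(b0, b1, prediction)
```

Real data will not sit perfectly on a line, in which case the same formulas give the line that minimises the sum of squared vertical errors.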

You’ll soon notice that a lot of machine learning these days is just different ways of curve fitting using basic statistics. Machine learning (at least in my opinion) only gets really exciting when you step into the world of deep learning.

Deep learning

This is a sub-field of machine learning that’s shown a lot of promise in recent years. It’s concerned with algorithms that are based on the structure and function of neurons in the brain.

Slide by Andrew Ng, all rights reserved.

One of the most exciting features of deep learning is its performance in feature learning; the algorithms perform particularly well in being able to detect features from raw data. A good example is a deep learning model’s ability to do things like identify the wheels from the image of a car. The diagram below illustrates the difference between typical machine learning and deep learning:

Deep learning models usually consist of multiple layers. They typically combine simpler models to build more complicated ones by passing data along from one layer to the next, which is one of the primary reasons deep learning outperforms other learning algorithms as the amount of data increases.
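
The layering idea can be sketched in a few lines of plain Python: each layer is a simple function (a weighted sum followed by a ReLU activation), and the output of one layer becomes the input of the next. The weights below are hand-picked rather than learned, and the helper names are my own; training, which is the hard part, is deliberately left out.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Simple non-linearity: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]


def forward(inputs, layers):
    """Pass the data through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations


# Hypothetical, hand-picked parameters (a real network would learn these).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 units
    ([[1.0, 1.0]], [-1.0]),                   # layer 2: 2 units -> 1 output
]
output = forward([2.0, 1.0], layers)
print(output)
```

Each layer on its own is barely more than the linear regression equation from earlier; the expressive power comes from stacking many of them.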

For a definitive introduction to deep learning, read The Deep Learning Book, available online for free through MIT Press.

TensorFlow is a Python library for fast numerical computing that was designed specifically for machine learning. It was open-sourced by Google with the hope of putting deep learning capabilities in the hands of a lot more researchers and developers around the world.

How to use TensorFlow

Once installed, TensorFlow provides multiple APIs for training ML models. The higher-level APIs, built on top of what is called TensorFlow Core (the lowest-level API, which gives you the most control), are the easiest to learn and should be where you start.

It would be counterproductive to include a full TensorFlow tutorial within this post when there already exist countless resources online that do it perfectly well… start with the official one:

While TensorFlow is the most popular machine learning library, there are several great alternatives like Torch (used by Facebook), Caffe (a deep learning framework by Berkeley AI Research) and many more.

What’s missing in this post

A lot. The goal here was to give you a definitive foundation in machine learning with minimal confusion. This topic is far too huge to document in full.

What next?

Once you understand the basics thoroughly, you should have some idea of what your interest in machine learning is. Do you want to use it for your app, or for research?

Based on your interests, you should be able to delve deeper into any of the different areas, either by following the many links embedded in this post or by digging up what you need with a few Google searches.

The hard part is getting a good foundation in machine learning. After that, it’s a matter of knowing what you want to accomplish.

In closing…

“The world is one big data problem.” – Andrew McAfee

Machine learning can be a tough beast to master. But if you’ve read up to this point you’d agree it’s an extremely valuable asset to have by your side.

“Torture the data, and it will confess to anything.” – Ronald Coase

Be cautious when you apply ML in the field: given the inherent nature of these algorithms, it can be hard to tell whether an algorithm reached its conclusion through a meaningful set of steps that can be replicated, or simply arrived at a result that ‘smells right’ through a faulty process.