Resources for getting started with ML and DL

January 3, 2018

The following is a collection of resources that I found useful when I started to learn machine learning and deep learning.
In this post I’m using the term machine learning to refer to classical machine learning and the term deep learning
to refer to machine learning with deep neural networks. There are numerous other good resources out there which are not
mentioned here. This doesn’t mean I consider them inferior; I simply haven’t used them and therefore can’t
comment on them.

First steps

If you are completely new to machine learning I recommend starting with the outstanding Stanford
Machine Learning course by Andrew Ng. It is easy to follow and covers
topics that every machine learning engineer really should know. The course uses Octave (an open source alternative to MATLAB)
for programming. Algorithms are implemented from scratch in order to get a better understanding of how they work. The course
is also good preparation for the Deep Learning specialization
at Coursera.
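
To give a flavor of what "from scratch" means here, the following is a minimal sketch in plain Python (not Octave, and not taken from the course material) of linear regression fitted by batch gradient descent. The function name `fit_linear_regression`, the toy data and the hyperparameter values are assumptions made up for this illustration.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, n_iterations=2000):
    """Fit y = theta0 + theta1 * x by batch gradient descent on the squared error."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(n_iterations):
        # Prediction errors for the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Gradients of the cost J = 1/(2m) * sum(errors^2).
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = fit_linear_regression(xs, ys)
print(round(theta0, 3), round(theta1, 3))
```

Writing the update rule by hand like this makes it obvious what the learning rate and the number of iterations actually control.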

After having taken the course I felt the need to learn Python and re-implement the exercises with scikit-learn.
Scikit-learn is a Python machine learning library that provides optimized and easy-to-use implementations for all algorithms
presented in the course. I published the results as the machine-learning-notebooks
project on GitHub.
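
As a rough illustration of the difference (this is not code from the notebooks project), fitting a linear regression with scikit-learn takes only a few lines. The toy data is made up, and the example assumes that numpy and scikit-learn are installed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data generated from y = 1 + 2x (made up for this example).
# scikit-learn expects a 2D feature array of shape (n_samples, n_features).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

model = LinearRegression()
model.fit(X, y)

# Learned parameters and a prediction for an unseen input.
print(model.intercept_, model.coef_)
prediction = model.predict(np.array([[5.0]]))
print(prediction)
```

The same `fit`/`predict` pattern applies to the other estimators in the library, which is what makes switching between algorithms cheap.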

Further courses

Deep Learning specialization. This specialization consists of
five courses, taught by Andrew Ng, covering deep neural network basics, regularization and optimization, and models for
computer vision and sequences (text, speech, …). If you enjoyed the quality and accessibility of Andrew’s
Machine Learning course you will probably like this specialization too. It
provides you with the skills needed to follow more advanced literature in that field, including research papers.

The initial programming exercises for the basics are in plain Python/numpy to get a better understanding of how forward
and backward propagation work. Models for computer vision are implemented with TensorFlow
and Keras. Many examples cover recent research literature from 2014 or newer (ResNet, GoogLeNet,
FaceNet, and many more). The last course on sequence models wasn’t available yet at the time of writing this post.
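
To illustrate what implementing forward and backward propagation in numpy can look like, here is a minimal sketch of a network with one tanh hidden layer and a sigmoid output, trained on the XOR problem. It is not taken from the course exercises; the architecture, the function names, the initialization and the hyperparameters are assumptions chosen for this example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(X, params):
    """Forward pass. X has shape (n_features, m) with one column per example."""
    Z1 = params["W1"] @ X + params["b1"]
    A1 = np.tanh(Z1)
    Z2 = params["W2"] @ A1 + params["b2"]
    A2 = sigmoid(Z2)
    return A2, {"A1": A1, "A2": A2}


def cost(A2, Y):
    """Binary cross-entropy averaged over the m examples."""
    m = Y.shape[1]
    return float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)


def backward(X, Y, params, cache):
    """Backward pass: gradients of the cost with respect to all parameters."""
    m = X.shape[1]
    A1, A2 = cache["A1"], cache["A2"]
    # For a sigmoid output with cross-entropy loss the output-layer error
    # reduces to A2 - Y (the 1/m factor is applied below).
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    # Chain rule through the tanh activation: d/dz tanh(z) = 1 - tanh(z)^2.
    dZ1 = (params["W2"].T @ dZ2) * (1 - A1 ** 2)
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return {"W1": dW1, "b1": db1, "W2": dW2, "b2": db2}


# XOR inputs and labels, one example per column.
X = np.array([[0.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 0.0, 1.0]])
Y = np.array([[0.0, 1.0, 1.0, 0.0]])

rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(scale=0.5, size=(3, 2)),
    "b1": np.zeros((3, 1)),
    "W2": rng.normal(scale=0.5, size=(1, 3)),
    "b2": np.zeros((1, 1)),
}

learning_rate = 0.5
initial_cost = cost(forward(X, params)[0], Y)
for _ in range(1000):
    A2, cache = forward(X, params)
    grads = backward(X, Y, params, cache)
    for name in params:
        params[name] -= learning_rate * grads[name]
final_cost = cost(forward(X, params)[0], Y)
print(initial_cost, final_cost)
```

A useful sanity check for hand-written backward passes is to compare individual gradient entries against a finite-difference approximation of the cost.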

A good understanding of statistical inference basics is important to get more out of the machine learning, deep learning
and statistics literature listed further below. If you need a refresher on statistical inference basics then the following
courses might be helpful:

Bayesian statistics. Many advanced machine learning and deep
learning techniques are based on Bayesian inference. The course
teaches the basics (Bayes’ rule, conjugate models, Bayesian inference on discrete and continuous data, …) and compares
them to the frequentist approach. Other basics such as Markov Chain Monte Carlo (MCMC) and hierarchical models are not
covered though. A good companion to this course is the book
Doing Bayesian data analysis (see below).
Before taking this course, familiarity with the frequentist approach is helpful.
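
As a small illustration of a conjugate model (not taken from the course), the sketch below updates a Beta prior on a coin's probability of heads with observed flips and compares the posterior mean with the frequentist maximum likelihood estimate. The function names, the uniform prior and the data are made up for this example.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior and a binomial likelihood, Bayes' rule
    yields a Beta(alpha + successes, beta + failures) posterior.
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Uniform prior Beta(1, 1), then 7 heads and 3 tails are observed.
prior = (1.0, 1.0)
posterior = beta_binomial_update(*prior, successes=7, failures=3)
posterior_mean = beta_mean(*posterior)

# Frequentist maximum likelihood estimate for comparison.
mle = 7 / 10

print(posterior, posterior_mean, mle)
```

The posterior mean is pulled slightly from the maximum likelihood estimate towards the prior mean of 0.5, and this effect shrinks as more data is observed.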

Books

Machine learning - a probabilistic perspective. A comprehensive
book on classical machine learning techniques. Its focus is rather theoretical and the descriptions are math-heavy. All
concepts are explained in an excellent way and therefore rather easy to follow even for machine learning beginners, given
basic familiarity with multivariate calculus, probability and linear algebra. The book covers both the frequentist and
Bayesian approach to inferring parameters of statistical models. Code examples are in MATLAB but there is also a
Python port available.

Deep learning. A comprehensive book on deep learning techniques. Part 1 covers
machine learning basics. Part 2 covers deep neural network basics, convolutional neural networks (CNNs) and recurrent
neural networks (RNNs). The content is comparable to that of the
Deep Learning specialization but is presented in a more academic
way. Part 3 covers more advanced topics such as auto-encoders, representation learning and deep generative models. This
is not a book for practitioners but one of the best deep learning overview books I’ve seen.

Hands-on machine learning with scikit-learn and TensorFlow. If
you’ve already taken a first machine learning and deep learning course, this book is for you. It is packed with useful
code examples and guidelines for real-world machine learning projects. Part 1 focuses on the implementation of classical
machine learning models with scikit-learn. Part 2 focuses on deep learning with TensorFlow. In addition to CNNs and RNNs
this part also has chapters on auto-encoders and reinforcement learning. Both theory and code examples are presented
in a clear and concise way.

Deep learning with Python. Another excellent deep learning
book for practitioners with code examples using Keras. Keras is a deep learning framework with a higher-level API than
TensorFlow that aims to enable rapid prototyping. In addition to detailed coverage of CNNs and RNNs this book also has
chapters on advanced deep learning best practices and generative deep learning. It is a good complement to part 2 of the
previous book (from a tools perspective). If you are not sure which one is better to start with, I recommend this one, as
first steps are easier with Keras than with TensorFlow in my opinion.

Doing Bayesian data analysis. An excellent
introduction to Bayesian statistics that prepares you well for reading more advanced literature in that field. It covers
the Bayesian analogues to traditional statistical tests (t, ANOVA, Chi-squared, …) and to multiple linear and logistic
regression among many others. It requires only a basic knowledge of calculus. For me, the book was a helpful companion to
the books Machine learning - a probabilistic perspective and
Deep learning. Code examples are written in R using the packages
JAGS and Stan for MCMC sampling. There’s
also a Python port available using PyMC3.

Data Science from Scratch. This book is about data science in its
most distilled form. Don’t expect too much depth here, but do expect a great overview of data science topics such as probability and
statistics, data preparation and machine learning basics. The book focuses on understanding fundamental data science tools
by implementing them in plain Python from scratch. Well-known statistical and machine learning libraries are not used here
but each chapter contains references to libraries you should actually use for your own projects and links for further reading.
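
In the same spirit, here is a minimal sketch of a few basic statistics written in plain Python. It is not code from the book; the function names and the sample data are assumptions made up for this illustration.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Sample variance (divides by n - 1)."""
    x_bar = mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation coefficient."""
    return covariance(xs, ys) / math.sqrt(variance(xs) * variance(ys))


# Made-up sample data with a perfectly linear relationship.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
print(mean(xs), variance(xs), correlation(xs, ys))
```

For real projects you would normally reach for an existing implementation instead, for example `statistics.mean` and `statistics.variance` from the Python standard library or the equivalent numpy functions.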

There is also a large number of useful blogs and survey papers about machine learning which I’ll leave for a separate
post. I nevertheless hope you find this a useful guide for getting started with machine learning.