A Machine Learning Guide for Average Humans

Machine learning (ML) has grown consistently in worldwide prevalence. Its implications have stretched from small, seemingly inconsequential victories to groundbreaking discoveries. The SEO community is no exception. An intuition for machine learning can support our understanding of the challenges and solutions Google’s engineers are facing, while also opening our minds to ML’s broader implications.

The advantages of gaining a general understanding of machine learning include:

When code works and data is produced, it’s a very fulfilling, empowering feeling (even if it’s a very humble result)

I spent a year taking online courses, reading books, and learning about learning (…as a machine). This post is the fruit borne of that labor — it covers 17 machine learning resources (including online courses, books, guides, conference presentations, etc.) comprising the most affordable and popular machine learning resources on the web (through the lens of a complete beginner). I’ve also added a summary of “If I were to start over again, how I would approach it.”

This article isn’t about credit or degrees. It’s about regular Joes and Joannas with an interest in machine learning, and who want to spend their learning time efficiently. Most of these resources will consume over 50 hours of commitment. Ain’t nobody got time for a painful waste of a work week (especially when this is probably completed during your personal time). The goal here is for you to find the resource that best suits your learning style. I genuinely hope you find this research useful, and I encourage comments on which materials prove most helpful (especially ones not included)! #HumanLearningMachineLearning

Executive summary:

Here’s everything you need to know in a chart:

*Free, but there is the cost of running an AWS EC2 instance (~$70 when I finished, but I did tinker a ton and made a Rick and Morty script generator, which I ran many epochs [rounds] of…)

Here’s my suggested program:

1. Starting out (estimated 60 hours)

Start with shorter content targeting beginners. This will allow you to get the gist of what’s going on with minimal time commitment.

2. Ready to commit (estimated 80 hours)

By this point, learners would understand their interest levels. Continue with content focused on applying relevant knowledge as fast as possible.

3. Broadening your horizons (estimated 115 hours)

If you’ve made it through the last section and are still hungry for more knowledge, move on to broadening your horizons. Read content focused on teaching the breadth of machine learning: building an intuition for what the algorithms are trying to accomplish (whether visually or mathematically).

Your next steps

By this point, you will already have AWS running instances, a mathematical foundation, and an overarching view of machine learning. This is your jumping-off point to determine what you want to do.

Why am I recommending these steps and resources?

I am not qualified to write an article on machine learning. I don’t have a PhD. I took one statistics class in college, which marked the first moment I truly understood “fight or flight” reactions. And to top it off, my coding skills are lackluster (at their best, they’re chunks of reverse-engineered code from Stack Overflow). Despite my many shortcomings, this piece had to be written by someone like me, an average person.

Statistically speaking, most of us are average (ah, the bell curve/Gaussian distribution always catches up to us). Since I’m not tied to any elitist sentiments, I can be real with you. Below is a high-level summary of my reviews of all of the classes I took, along with a plan for how I would approach learning machine learning if I could start over.

In-depth reviews of machine learning courses:

Starting out

Jason Mayes’ Machine Learning 101 slidedeck: 2 years of head-banging, so you don’t have to ↓

Need to Know: A stellar high-level overview of machine learning fundamentals in an engaging and visually stimulating format.

Loved:

Very user-friendly, engaging, and playful slidedeck.

Has the potential to take some of the pain out of the process, through introducing core concepts.

Since there is a wealth of knowledge, refer back as needed (or as a grounding source).

Identify areas of interest and explore the resources provided.

{ML} Recipes with Josh Gordon ↓

Need to Know: This YouTube-hosted mini-series covers the very fundamentals of machine learning, with opportunities to complete exercises.

Loved:

It is genuinely beginner-focused.

They make no assumption of any prior knowledge.

Gloss over potentially complex topics that may serve as noise.

Playlist ~2 hours

Very high-quality filming, audio, and presentation, almost to the point where it had its own aesthetic.

Covers some examples in scikit-learn and TensorFlow, which felt modern and practical.

Josh Gordon was an engaging speaker.

Disliked:

I could not get Docker installed on Windows (the suggested container tool). This wasn’t a huge deal, since I already had my AWS setup by this point; however, it was a bit of a bummer, since it made it impossible to follow certain steps exactly.

Issue: Every time I tried to download (over the course of two weeks), the .exe file would recursively start and keep spinning until either my memory ran out, computer crashed, or I shut my computer down. I sent this to Docker’s Twitter account to no avail.

The playlist is short (only ~1.5 hours screen time). However, it can be a bit fast-paced at times (especially if you like mimicking the examples), so set aside 3-4 hours to play around with examples and allow time for installation, pausing, and following along.

Take time to explore code labs.

Google’s Machine Learning Crash Course with TensorFlow APIs ↓

Need to Know: A Google researcher-made crash course on machine learning that is interactive and offers its own built-in coding system!

Co-author of A Practical Guide to Data Structures and Algorithms Using Java

Numerous journals, classes taught at Washington University, and contributions to the ML community

Links:

Tips on Doing:

Actively work through playground and coding exercises

OCDevel’s Machine Learning Guide Podcast ↓

Need to Know: This podcast focuses on the high-level fundamentals of machine learning, including basic intuition, algorithms, math, languages, and frameworks. It also includes references to learn more on each episode’s topic.

Kaggle Machine Learning Track (Lesson 1) ↓

Need to Know: A simple code lab that covers the very basics of machine learning with scikit-learn and pandas, and has you apply each example to a second dataset.
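
To give a feel for what a first scikit-learn and pandas exercise looks like, here is a minimal sketch of the load, fit, and predict loop. It assumes pandas and scikit-learn are installed. The toy housing data, the column names, and the choice of DecisionTreeRegressor are illustrative assumptions on my part, not a transcript of the lesson.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data standing in for a real file loaded with pd.read_csv(...).
homes = pd.DataFrame({
    "rooms": [2, 3, 3, 4, 5, 2],
    "land_size": [120, 200, 180, 300, 450, 100],
    "price": [300_000, 420_000, 400_000, 560_000, 750_000, 280_000],
})

features = ["rooms", "land_size"]  # columns the model is allowed to look at
X = homes[features]
y = homes["price"]                 # the value we want to predict

model = DecisionTreeRegressor(random_state=0)
model.fit(X, y)

# Predicting on the same rows the model was trained on reproduces the known
# prices, because an unconstrained tree can memorize a small dataset whose
# rows are all distinct. This says nothing yet about unseen homes.
predictions = model.predict(X)
print(predictions)
```

Swapping in a different DataFrame and a different list of feature columns is essentially the "apply it to another dataset" step.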

Loved:

A more active form of learning.

An engaging code lab that encourages participants to apply knowledge.

This track has a built-in Python notebook on Kaggle with all input files included. This removed any and all setup/installation issues.

Side note: It’s a bit different than Jupyter notebook (e.g., have to click into a cell to add another cell).

Each lesson is short, which made the entire lesson go by very fast.

Disliked:

The writing in the first lesson didn’t initially make it clear that one would need to apply the knowledge in the lesson to their workbook.

It wasn’t a big deal, but when I started referencing files in the lesson, I had to dive into the files in my workbook to find they didn’t exist, only to realize that the knowledge was supposed to be applied and not transcribed.

Try lesson 2, which covers more complex/abstract topics (note: this second lesson took a bit longer to work through).

Ready to commit

Fast.ai (part 1 of 2) ↓

Need to Know: Hands-down the most engaging and active form of learning ML. The source I would most recommend for anyone (although the training plan does help to build up to this course). This course is about learning through coding. This is the only course in which I truly started to see the practical mechanics come together. It involves applying the most practical solutions to the most common problems (while also building an intuition for those solutions).

Loved:

Course Philosophy:

Active learning approach

“Go out into the world and understand underlying mechanics (of machine learning by doing).”

Counter-culture to the exclusivity of the machine learning field, focusing on inclusion.

“Let’s do shit that matters to people as quickly as possible.”

Highly pragmatic approach with tools that are currently being used (Jupyter Notebooks, scikit-learn, Keras, AWS, etc.).

Show an end-to-end process that you get to complete and play with in a development environment.

Math is involved, but is not prohibitive. Excel files helped to consolidate information/interact with information in a different way, and Jeremy spends a lot of time recapping confusing concepts.

Amazing set of learning resources that allow for all different styles of learning, including:

Video Lessons

Notes

Jupyter Notebooks

Assignments

Highly active forums

Resources on Stack Overflow

Readings/resources

Jeremy often references popular academic texts

Jeremy’s TEDx talk in Brussels

Jeremy really pushes one to do extra and put in the effort by teaching interesting problems and engaging one in solving them.

“Hands-On Machine Learning with Scikit-Learn and TensorFlow” by Aurélien Géron ↓

Need to Know: This book is an Amazon best seller for a reason. It covers a lot of ground quickly, empowers readers to walk through a machine learning problem by chapter two, and contains practical up-to-date machine learning skills.

Loved:

Book contains an amazing introduction to machine learning that briskly provides an overarching quick view of the machine learning ecosystem.

Immediately afterwards, Aurélien pushes a user to attempt to apply this solution to another problem, which was very empowering.

There are review questions at the end of each chapter to ensure one has grasped the content within the chapter and to push the reader to explore more.

Once installation was completed, it was easy to follow and all code is available on GitHub.

Chapters 11-14 were very tough reading; however, they were a great reference when working through Fast.ai.

Contains some powerful analogies.

Each chapter’s introductions were very useful and put everything into context. This general-to-specifics learning was very useful.

Disliked:

Installation was a common source of issues during the beginning of my journey; the text glided over this. I felt the frustration that most people experience from installation should have been addressed with more resources.

Read the introductions to each chapter thoroughly, read the chapter (pay careful attention to code), review the questions at the end (highlight any in-text answer), make a copy of Aurélien’s GitHub and make sure everything works on your setup, re-type the notebooks, go to Kaggle and try on other datasets.

Broadening your horizons

Udacity: Intro to Machine Learning (Kate/Sebastian) ↓

Need to Know: A course that covers a range of machine learning topics, supports building of intuition via visualization and simple examples, offers coding challenges, and a certificate (upon completion of a final project). The biggest challenge with this course is bridging the gap between the hand-holding lectures and the coding exercises.

Loved:

Focus on developing a visual intuition on what each model is trying to accomplish.

This visual approach to learning the mathematics is very useful.

Covers a wide variety of models and machine learning basics.

In terms of presenting the concept, there was a lot of hand-holding (which I completely appreciated!).

Many people have done this training, so their GitHub accounts can be used as reference for the mini-projects.

Andrew Ng’s Coursera Machine Learning Course ↓

Need to Know: The Andrew Ng Coursera course is the most referenced online machine learning course. It covers a broad set of fundamental, evergreen topics with a strong focus on building mathematical intuition behind machine learning models. Also, you can submit assignments and earn a grade for free. If you want to earn a certificate, you can subscribe or apply for financial aid.

Loved:

This course has a high level of credibility.

Introduces all necessary machine learning terminology and jargon.

Contains a very classic machine learning education approach with a high level of math focus.

Quizzes interspersed in courses and after each lesson support understanding and overall learning.

The sessions for the course are flexible; the option to switch into a different session is always available.

Disliked:

The mathematical notation was hard to process at times.

The content felt a bit dated and non-pragmatic. For example, the main concentration was MATLAB and Octave versus more modern languages and resources.

Mike King has a few slide decks on the basics of machine learning and AI

iPullRank has a few data scientists on staff

Links:

Tips on Reading:

Read chapters 1-6 and the rest depending upon personal interest.

Review Google PhD ↓

Need to Know: A two-hour presentation from Google’s 2017 I/O conference that walks through getting 99% accuracy on the MNIST dataset (a famous dataset of handwritten digits that the machine must learn to identify).

Loved:

This talk struck me as very modern, covering the cutting edge.

Found this to be very complementary to Fast.ai, as it covered similar topics (e.g., ReLU, CNNs, RNNs, etc.).

Amazing visuals that help to put everything into context.

Disliked:

The presentation is only a short conference session and not a comprehensive view of machine learning.

Started Mobipocket, a startup that later became the software part of the Amazon Kindle and its mobile variants

Links:

Tips on Watching:

Google any concepts you’re unfamiliar with.

Take your time with this one; the 2 hours of screen time doesn’t include all of the Googling and processing time.

Caltech Machine Learning iTunes ↓

Need to Know: If math is your thing, this course does a stellar job of building the mathematical intuition behind many machine learning models. Dr. Abu-Mostafa is a raconteur who includes useful visualizations, relevant real-world examples, and compelling analogies.

Loved:

First and foremost, this is a real Caltech course, meaning it’s not a watered-down version and contains fundamental concepts that are vital to understanding the mechanics of machine learning.

On iTunes, audio downloads are available, which can be useful for on-the-go learning.

Dr. Abu-Mostafa is a skilled speaker, making the 27 hours spent listening much easier!

Dr. Abu-Mostafa offers up some strong real-world examples and analogies, which make the content more relatable.

As an example, he asks students: “Why do I give you practice exams and not just give you the final exam?” as an illustration of why a testing set is useful. If he were to just give students the final, they would just memorize the answers (i.e., they would overfit to the data) and not genuinely learn the material. The final is a test to show how much students have learned.

The last 1/2 hour of the class is always a Q&A, where students can ask questions. Their questions were useful to understanding the topic more in-depth.

The video and audio quality was strong throughout. There were a few times when I couldn’t understand a question in the Q&A, but overall very strong.

This course is designed to build mathematical intuition of what’s going on under the hood of specific machine learning models.
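
The practice-exam analogy above can be made concrete with a small sketch. This is my own toy illustration, not material from the course: the data generator, the "memorizer," and the "threshold rule" are all hypothetical stand-ins. One model memorizes every practice answer; the other learns a single general rule. Both are then scored on a held-out "final" they never saw.

```python
import random


def make_exam(n_questions, rng):
    """Build (x, label) pairs: label is 1 when x > 0.5, with 20% of labels flipped as noise."""
    exam = []
    for _ in range(n_questions):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.2:
            label = 1 - label
        exam.append((x, label))
    return exam


def accuracy(predict, exam):
    """Fraction of questions answered correctly."""
    return sum(predict(x) == label for x, label in exam) / len(exam)


def fit_memorizer(practice):
    """Memorize every practice answer verbatim; guess 0 on any unseen question."""
    answers = dict(practice)
    return lambda x: answers.get(x, 0)


def fit_threshold(practice):
    """Learn one general rule: answer 1 when x exceeds the best-scoring cutoff."""
    cutoffs = [i / 20 for i in range(21)]
    best = max(cutoffs, key=lambda c: accuracy(lambda x: int(x > c), practice))
    return lambda x: int(x > best)


rng = random.Random(0)          # fixed seed so the run is repeatable
practice = make_exam(200, rng)  # training set (the "practice exams")
final = make_exam(1000, rng)    # held-out test set (the "final")

memorizer = fit_memorizer(practice)
rule = fit_threshold(practice)

for name, model in [("memorizer", memorizer), ("threshold rule", rule)]:
    print(f"{name}: practice={accuracy(model, practice):.2f}, "
          f"final={accuracy(model, final):.2f}")
```

The memorizer aces the practice questions but falls back to guessing on the final, while the simpler rule scores about the same on both. Only the held-out set exposes that difference, which is the point of keeping a testing set aside.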

Professor of Electrical Engineering and Computer Science at the California Institute of Technology

Chairman of Machine Learning Consultants LLC

Serves on a number of scientific advisory boards

Has served as a technical consultant on machine learning for several companies (including Citibank).

Multiple articles in Scientific American

Links:

Tips on Watching:

Consider listening to the last lesson first, as it pulls together the course overall conceptually. The map of the course, below, was particularly useful to organizing the information taught in the courses.

“Pattern Recognition & Machine Learning” by Christopher Bishop ↓

Need to Know: This is a very popular college-level machine learning textbook. I’ve heard it likened to a bible for machine learning. However, after spending a month trying to tackle the first few chapters, I gave up. There was too much math and too many prerequisites to tackle (even with a multitude of Google sessions).

Loved:

The text of choice for many major universities, so if you can make it through this text and understand all of the concepts, you’re probably in a very good position.

I appreciated the history aside sections, where Bishop talked about influential people and their career accomplishments in statistics and machine learning.

Despite being a highly mathematical text, the textbook actually has some pretty visually intuitive imagery.

Disliked:

I couldn’t make it through the text, which was a bit frustrating. The statistics and mathematical notation (which is probably very benign for a student in this topic) were too much for me.

Udacity: Machine Learning by Georgia Tech ↓

Need to Know: A mix between an online learning experience and a university machine learning teaching approach. The lecturers are fun, but the course still fell a bit short in terms of active learning.

Loved:

This class is offered as CS7641 at Georgia Tech, where it is a part of the Online Masters Degree. Although taking this course here will not earn credit towards the OMS degree, it’s still a non-watered-down college teaching philosophy approach.

Covers a wide variety of topics, many of which reminded me of the Caltech course (including: VC Dimension versus Bayesian, Occam’s razor, etc.)

Discusses Markov Decision Chains, which didn’t really come up in many other introductory machine learning courses but are referenced within Google patents.

The lecturers have a great dynamic, are wicked smart, and displayed a great sense of (nerd) humor, which makes the topics less intimidating.

The course has quizzes, which give the course a slight amount of interaction.

Disliked:

Some videos were very long, which made the content a bit harder to digest.

The course overall was very time consuming.

Despite the quizzes, the course was a very passive form of learning with no assignments and little coding.

Many videos started with a bunch of content already written out. Having the content written out was probably a big time-saver, but it was also a bit jarring for a viewer to see so much information all at once, while also trying to listen.

It’s vital to pay very close attention to notation, which compounds in complexity quickly.

Tablet version didn’t function flawlessly: some content was missing (which I had to mark down and review on a desktop), the app would crash randomly on the tablet, and sometimes the audio wouldn’t start.

There were no subtitles available on tablet, which I found not only to be a major accessibility blunder, but also made it harder for me to process (since I’m not an audio learner).

Andrew Ng’s Stanford’s Machine Learning iTunes ↓

Need to Know: A non-watered-down Stanford course. It’s outdated (filmed in 2008), video/audio are a bit poor, and most links online now point towards the Coursera course. Although the idea of watching a Stanford course was energizing for the first few lectures, it became dreadfully boring. I made it to lecture six before calling it.

Loved:

Designed for students, so you know you’re not missing out on anything.

This course provides a deeper study into the mathematical and theoretical foundation behind machine learning to the point that the students could create their own machine learning algorithms. This isn’t necessarily very practical for the everyday machine learning user.

Has some powerful real-world examples (although they’re outdated).

There is something about the kinesthetic nature of watching someone write information out. The blackboard writing helped me to process certain ideas.

Disliked:

Video and audio quality were a pain to sit through.

Many questions asked by students were hard to hear.

On-screen visuals range from hard to impossible to see.

Found myself counting minutes.

Dr. Ng mentions TA classes and supplementary learning, but these are not available online.

Sometimes the video showed students, which I felt was invasive.

Lecturer:

Andrew Ng (see above)

Links:

Tips on Watching:

Only watch if you’re looking to gain a deeper understanding of the math presented in the Coursera course.

Skip the first half of the first lecture, since it’s mostly class logistics.

Additional Resources

Motivations and inspiration

If you’re wondering why I spent a year doing this, then I’m with you. I’m genuinely not sure why I set my sights on this project, much less why I followed through with it. I saw Mike King give a session on Machine Learning. I was caught off guard, since I knew nothing on the topic. It gave me a pesky, insatiable curiosity itch. It started with one course and then spiraled out of control. Eventually it transformed into an idea: a review guide on the most affordable and popular machine learning resources on the web (through the lens of a complete beginner). Hopefully you found it useful, or at least somewhat interesting. Be sure to share your thoughts or questions in the comments!