The InfoQ eMag: Introduction to Machine Learning

About the Author

InfoQ.com is facilitating the spread of knowledge and innovation in professional software development. InfoQ content is currently published in English, Chinese, Japanese and Brazilian Portuguese. With a readership base of over 1,400,000 unique visitors per month reading content from 100 locally-based editors across the globe, we continue to build localized communities.

Machine learning has long powered many products we interact with daily—from "intelligent" assistants like Apple's Siri and Google Now, to recommendation engines like Amazon's that suggest new products to buy, to the ad ranking systems used by Google and Facebook.

More recently, machine learning has entered the public consciousness because of advances in "deep learning"—these include AlphaGo's defeat of Go grandmaster Lee Sedol and impressive new products around image recognition and machine translation.

While much of the press around machine learning has focused on achievements that were not previously possible, the full range of machine learning methods—from traditional techniques that have been around for decades to more recent approaches with neural networks—can be deployed to solve many important (but perhaps more prosaic) problems that businesses face. Examples of these applications include, but are by no means limited to, fraud prevention, time-series forecasting, and spam detection.

InfoQ has curated a series of articles for this introduction to machine learning eMagazine covering everything from the very basics of machine learning (what are typical classifiers and how do you measure their performance?), to production considerations (how do you deal with changing patterns in data after you’ve deployed your model?), to newer techniques in deep learning. After reading through this series, you should be ready to start on a few machine learning experiments of your own.

The Introduction to Machine Learning eMag includes:

Introduction to Machine Learning with Python - We begin with the basics, using a concrete problem to frame the discussion: how can we detect credit card fraud using machine learning? We’ll discuss feature encoding, various types of models (logistic regression, decision trees, and random forests), and measures of model performance (precision, recall, and ROC curves). We’ll build models using popular open source libraries available for Python and include essentially all the code you’ll need to develop similar models yourself.

Practicing Machine Learning with Optimism - Using machine learning to solve real-world problems often presents challenges that weren't initially considered during the development of the machine learning method. In the next article, Alyssa Frazee addresses a few examples of such issues: how do you obtain confidence intervals around uncertain estimates, how do you update and retrain your models when the models themselves are changing the world (and the data you have available), and how do you explain the seemingly black-box decisions that models make?

Anomaly Detection for Time Series Data with Deep Learning - We take a detour from traditional machine learning techniques and problems to introduce deep learning: machine learning built on neural networks, models that take their name from their loose resemblance to the connections between neurons in the brain. Tom Hanlon discusses the various types of neural networks (feed-forward, convolutional, and recurrent) and describes how to build a recurrent neural network that detects anomalies in time series data. To make the discussion concrete, Tom uses Deeplearning4j, a popular open-source deep-learning library for the JVM, in his examples.

Real-World, Man-Machine Algorithms - In this article, Edwin Chen and Justin Palmer talk about the end-to-end flow of developing machine learning models. Kaggle competitions may lead you to believe that the hard part of machine learning is just in the algorithm tuning, but in reality there are a host of problems to address before and after the algorithmic part: where do you get the labels for your data? And how do you address changes in those labels over time? Approaches to these and similar issues in model “lifecycle” management are discussed.

Book Review: Andrew McAfee and Erik Brynjolfsson's The Second Machine Age - As machine learning becomes increasingly prevalent, society will have to address the impact the technology has on workers who might be displaced. In their book The Second Machine Age, Andrew McAfee and Erik Brynjolfsson discuss some of the potential effects that artificial intelligence and related technologies will have, particularly on economic inequality, and propose policy interventions to mitigate the negative impact.
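
As a taste of the performance measures the first article covers, precision and recall can be sketched in a few lines of plain Python. This is a minimal illustration and not code from the eMag: the function name `precision_recall` and the sample labels are hypothetical, and the label 1 is assumed to mark a fraudulent transaction.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 marks fraud."""
    pairs = list(zip(y_true, y_pred))
    # True positives: fraud that the model flagged.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    # False positives: legitimate transactions flagged as fraud.
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # False negatives: fraud that the model missed.
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against division by zero when nothing is flagged or no fraud exists.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for eight transactions (illustrative data only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision answers "of the transactions flagged as fraud, how many really were?", while recall answers "of the real fraud, how much was caught?". In practice a library routine would likely replace the hand-rolled counts; they are spelled out here only to make the definitions visible.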
