Issue #128

May 5, 2016

Editor Picks

The Special Relationship Between Noodles and Qdoba
I’ve had a theory that for every Noodles, there’s a Qdoba that’s right next door. It might be some sort of selection bias, however, since I can think of a couple of locations where they’re directly next to each other. To me, Noodles and Qdoba have a special relationship, at least compared to other restaurants. I figured now was about the time I should test this, and I can use Chipotle to test...

A Neural Network that Dreams in Resumes
If a neural network can write Shakespeare, could it write a resume for you? Inspired by the remarkable results of Recurrent Neural Networks and using thousands of anonymized resumes from untapt, I’ve been experimenting with applying deep learning techniques to the CV...

A Message from this week's Sponsor:

Whitepaper: A Practical Guide to Building Data Driven Products Beyond Analysts' Laptops via @YhatHQ
Learn how to apply data science insights to the real world. Discover the implications beyond analysts’ laptops and answer the question of what to do with predictive models once they’re built.

Data Science Articles & Videos

How to get into the top 15 of a Kaggle competition using Python
Doing well in a Kaggle competition requires more than just knowing machine learning algorithms. It requires the right mindset, the willingness to learn, and a lot of data exploration. Many of these aspects aren’t typically emphasized in tutorials on getting started with Kaggle, though. In this post, I’ll cover how to get started with the Kaggle Expedia hotel recommendations competition, including establishing the right mindset, setting up testing infrastructure, exploring the data, creating features, and making predictions...

Finding Similar Music using Matrix Factorization
This post is a step-by-step guide on how to calculate related artists using a couple of different matrix factorization algorithms. The code is written in Python using Pandas and SciPy to do the calculations and D3.js to interactively visualize the results...

[Video] How Machine Learning Amplifies Inequality in Society
In this talk, Mike Williams, Research Engineer at Fast Forward Labs, looks at how supervised machine learning has the potential to amplify power and privilege in society. Using sentiment analysis, he demonstrates how text analytics often favors the voices of men. Mike discusses how bias can inadvertently be introduced into any model, and how to recognize and mitigate these harms...

Neural Networks Are Impressively Good At Compression
I hope I have given you an intuition for how neural networks can compress patterns in few weights. They use the full range of the weights to the point where a connection activated with a strong input can mean something entirely different than the same connection activated with a weak input. And best of all, I didn’t have to teach them to do this. They just start behaving like this if you force them to express a complex pattern in few connections...

Artistic style transfer for videos
We present an approach that transfers the style from one image (for example, a painting) to a whole video sequence. Supplementary video accompanying the paper...

Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning
Reinforcement learning offers a promising methodology for developing skills for simulated characters, but typically requires working with sparse hand-crafted features. Building on recent progress in deep reinforcement learning (DeepRL), we introduce a mixture of actor-critic experts (MACE) approach that learns terrain-adaptive dynamic locomotion skills using high-dimensional state and terrain descriptions as input, and parameterized leaps or steps as output actions...

The Descriptor Protocol, and Python Black Magic
Since I graduated last summer, I have been writing lots of both Python 2 and 3. This snippet seemed like something I should understand well. However, I did not, so this post is an attempt to solve that...

Jobs

SimpleReach is seeking a seasoned data scientist to join our ranks. This mathematically savvy individual will be on the front lines, wrangling data and investigating our massive stores of traffic events while also building machines to intelligently classify content and build recommendation engines for a wide range of applications...

Training & Resources

An Introduction to Scientific Python (and a Bit of the Maths Behind It) - Matplotlib
In this series of posts, we will take a look at the main libraries used in scientific Python and learn how to use them to bend data to our will. We won't just be learning to churn out template code, however; we will also learn a bit of the maths behind it so that we can understand what is going on a little better. So let's kick things off with an incredibly useful little number that we will be using throughout this series of posts: Matplotlib...

D3 Basic Pie Chart Video Tutorial
You will use the CSV data from the Pie Chart example on the D3js.org website to see how a full D3 pie chart visualization is built...

Identify, describe, plot, and remove the outliers from the dataset
There are different methods to detect outliers, including the standard deviation approach and Tukey’s method, which uses the interquartile range (IQR). In this post I will use Tukey’s method because I like that it does not depend on the distribution of the data...
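
For readers who have not met Tukey’s method before, here is a minimal Python sketch of the idea, not the code from the linked post. It assumes the conventional fence multiplier of 1.5 and the "inclusive" quartile definition from the standard library; the function names and the sample values are made up for illustration.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) Tukey fences for a list of numbers.

    k=1.5 is the conventional multiplier; it is an assumption here,
    not a value taken from the linked post.
    """
    # Quartiles via the standard library (Python 3.8+).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(values, k=1.5):
    """Split values into (kept, outliers) using Tukey's fences."""
    low, high = tukey_fences(values, k)
    kept = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return kept, outliers


# Hypothetical sample data with one obviously extreme value.
sample = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
kept, outliers = split_outliers(sample)
print("outliers:", outliers)
print("kept:", kept)
```

Because the fences are built from quartiles rather than the mean and standard deviation, a single extreme value has little influence on where the cut-offs land.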

Books

"This book is a treasure trove of intuitive, practical, and brilliant mathematical techniques. Every person with an interest in mathematics, science, or engineering will enjoy this highly stimulating and fun book."