Ways to think about machine learning
We're now four or five years into the current explosion of machine learning, and pretty much everyone has heard of it. It's not just that startups are forming every day or that the big tech platform companies are rebuilding themselves around it - everyone outside tech has read the Economist or BusinessWeek cover story, and many big companies have some projects underway. We know this is a Next Big Thing...

Given a satellite image, machine learning creates the view on the ground
Leonardo da Vinci famously created drawings and paintings that showed a bird’s eye view of certain areas of Italy with a level of detail that was not otherwise possible until the invention of photography and flying machines. Indeed, many critics have wondered how he could have imagined these details. But now researchers are working on the inverse problem: given a satellite image of Earth’s surface, what does that area look like from the ground? How clear can such an artificial image be?...

A Message from this week's Sponsor:

Are package acquisitions and approval processes slowing your ability to develop and deliver innovative models? A lot of your pain can be equated to the Model Myth - the notion that models should be treated like data or other digital assets like software. Models are fundamentally different and require a framework that embraces their differences. Read this paper to understand why models can’t be managed like other digital assets and to learn how to build this new organizational capability that is essential to remaining competitive in a model-driven world.

Data Science Articles & Videos

Troubling Trends in Machine Learning Scholarship
This paper aims to instigate discussion, answering a call for papers from the ICML Machine Learning Debates workshop. While we stand by the points represented here, we do not purport to offer a full or balanced viewpoint or to discuss the overall quality of science in ML...

What Image Classifiers Can Do About Unknown Objects
A few days ago I received a question from Plant Village, a team I’m collaborating with, about a problem that’s emerged with a mobile app they’re developing. It detects plant diseases and delivers good results when it’s pointed at leaves, but if you point it at a computer keyboard it thinks it’s a damaged crop. This isn’t a surprising result to computer vision researchers, but it is a shock to most other people, so I want to explain why it’s happening, and what we can do about it....

State of AI
In this report, we set out to capture a snapshot of the exponential progress in AI with a focus on developments in the past 12 months. Consider this report as a compilation of the most interesting things we’ve seen that seeks to trigger informed conversation about the state of AI and its implication for the future...

An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution
Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and one-hot pixel space...
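For readers unfamiliar with the task, here is a minimal sketch of the mapping the blurb describes, written by hand rather than learned. It only illustrates what "coordinates in (x,y) Cartesian space" and "one-hot pixel space" mean; it is not the paper's model or code. The function names, the 8x8 grid size, and the row-is-y indexing convention are all assumptions made for this illustration.

```python
# Illustrative only: the target mapping of the coordinate transform task,
# implemented directly. Names, grid size, and the (row = y, column = x)
# convention are assumptions, not taken from the paper.

def coords_to_one_hot(x, y, size=8):
    """Return a size x size grid of zeros with a single 1 at pixel (x, y)."""
    if not (0 <= x < size and 0 <= y < size):
        raise ValueError("coordinates must lie inside the grid")
    grid = [[0] * size for _ in range(size)]
    grid[y][x] = 1  # row index is y, column index is x
    return grid


def one_hot_to_coords(grid):
    """Inverse mapping: recover (x, y) from a grid with one active pixel."""
    for y, row in enumerate(grid):
        for x, value in enumerate(row):
            if value == 1:
                return x, y
    raise ValueError("grid has no active pixel")


if __name__ == "__main__":
    grid = coords_to_one_hot(3, 5)
    print(one_hot_to_coords(grid))
```

The direct version is a few lines of indexing, which is what makes it notable that the paper presents this mapping as a counterexample to the intuition that convolutional networks suit any spatial problem.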

Why businesses fail at machine learning
I’d like to let you in on a secret: when people say ‘machine learning’ it sounds like there’s only one discipline here. There are two, and if businesses don’t understand the difference, they can experience a world of trouble...

Tracking the Progress in Natural Language Processing
Research in Machine Learning and in Natural Language Processing (NLP) is moving so fast these days, it is hard to keep up... A number of resources exist that could help with this process, but each has deficits... As an alternative, I have created a GitHub repository that keeps track of the datasets and the current state-of-the-art for the most common tasks in NLP. The repository is kept as simple as possible to make maintenance and contribution easy. If I missed your favourite task or dataset or your new state-of-the-art result or if I made any error, you can simply submit a pull request...

Jobs

Junior Data Scientist - Dow Jones - NYC
We are looking for a Junior Analyst to join a specialized data team, focused on growing our subscription business and improving Dow Jones’ core products. The role will require technical expertise in handling large datasets, as well as an obsession with great news products. You will assist in the execution of large scale data projects and support daily reporting on core business functions...

Intro to Keras Layers
In this article, we’ll work through some of the basic principles of deep learning, by discussing the fundamental building blocks in this exciting field. Take a look at some of the primary ingredients of getting started below, and don’t forget to bookmark this page as your Deep Learning cheat sheet!...

Setting up a Spark Cluster on AWS
Our goal for today is to build our own cluster with Spark. Fortunately for us, Amazon has made this pretty simple. We’re going to get started by going to AWS...