Issue #76

May 7 2015

Editor Picks

Machine Learning for Emoji Trends
In October 2011, Apple added the emoji keyboard to iOS as an international keyboard. Since then, digital language has evolved such that nearly half of comments and captions on Instagram contain emoji characters. And earlier this week, Instagram also added support for emoji characters in hashtags, which allows people to tag and search content with their favorite emoji. In Part 1 of this blog post series, we will take a deep dive into emoji usage on Instagram. By applying machine learning and natural language processing techniques, we’ll discover the hidden semantics of emoji...

How we’re using Machine Learning to fight shell selling
In this first in an occasional series, we’re taking a look at machine learning initiatives at WePay: the kinds of problems we use machine learning for, how we build technology to address them, and how the unique challenges of the payments industry shape our approach. We thought the best introduction would be to look at an actual fraud problem we face, shell selling, and how we built the algorithm we’re now using to solve it...

The Simple, Elegant Algorithm That Makes Google Maps Possible
Algorithms are a science of cleverness. A natural manifestation of logical reasoning (mathematical induction, in particular), a good algorithm is like a fleeting, damning snapshot into the very soul of a problem. A jungle of properties and relationships becomes a simple recurrence relation, a single-line recursive step producing boundless chaos and complexity. And to see through deep complexity, it takes cleverness...

Data Science Articles & Videos

Predictive Machine Learning — Behind The Scenes at Fliptop
At Fliptop, our data science team uses machine learning to create a range of predictive models. The Fliptop predictive platform, or as we call it, Darwin, automatically builds an array of machine-learning models and applies dozens of statistical measures to determine which model is most predictive at each stage of the marketing and sales lifecycle. In this post, we’ll offer a glimpse into how Darwin works, focusing specifically on the predictive lead scoring component of our platform. From there, we’ll offer a few predictions of our own about where we see marketing technology going in coming years...

Parallel Machine Learning with Hogwild!
In this blog post, I will explain what stochastic gradient descent (SGD) is and how thread locking has a very large effect on performance. I will attempt to explain how parallel algorithms for machine learning such as Hogwild! work, why they have transformed big data analytics, and how GraphLab Create not only adopts these techniques but also actively pushes the frontier of parallel machine learning algorithms...

Deep Learning for Decision Making and Control
A remarkable feature of human and animal intelligence is the ability to autonomously acquire new behaviors. This research is concerned with designing algorithms that aim to bring this ability to robots and simulated characters. In this talk, Levine will describe a class of guided policy search algorithms that tackle this challenge by transforming the task of learning control policies into a supervised learning problem, with supervision provided by simple, efficient trajectory-centric methods...

Image Scaling using Deep Convolutional Neural Networks
This past summer I interned at Flipboard in Palo Alto, California. I worked on machine learning based problems, one of which was Image Upscaling. This post will show some preliminary results, discuss our model and its possible applications to Flipboard’s products...

Why Topological Data Analysis Works
Topological data analysis has been very successful in discovering information in many large and complex data sets. In this post, I would like to discuss the reasons why it is an effective methodology...

How I Became Chief Data Scientist
I’m the U.S. Chief Data Scientist — and I got my start in community college. Yes, I’ve got a Ph.D. in applied mathematics, have been fortunate to help build amazing companies, and have been at the forefront of the data science movement. But the critical first step in that journey started at De Anza Jr. College in Cupertino, California...

Jobs

EA is seeking a Data Scientist for its Red Crow studio, reporting to the studio Product Manager. We are looking for a professional with advanced statistical analysis skills. This role requires a passion for understanding players’ behavior through data, high attention to detail and data integrity... focus on analyzing large sets of data surrounding acquisition, engagement, and monetization and helping to automate this process for game teams, analysts and product managers.
The individual should have a desire for data mining, scripting, problem solving and statistical analysis. This person will preferably have a strong interest in gaming (specifically mobile or social) or a fast paced company where data is core to its operations. ...

Training & Resources

10 Common NLP Terms Explained for the Text Analysis Novice
If you’re relatively new to the NLP and Text Analysis world, you’ve more than likely come across some pretty technical terms and acronyms that are challenging to get your head around, especially if you’re relying on scientific definitions for a plain and simple explanation. We decided to put together a list of 10 common terms in Natural Language Processing, which we’ve broken down in layman’s terms to make them easier to understand...