Issue #60

Jan 15 2015

Editor Picks

R is still hot, and getting hotter
It's been more than four years since I wrote the white paper R is Hot with the goal of introducing R to companies who need modern and flexible data analysis software. It's still the most-downloaded whitepaper on the Revolution Analytics website. But a lot has changed in the past four years: R's popularity has grown, and more and more companies are adopting R for various applications. So I decided to update the paper with the latest statistics on R usage, and even more examples of how R is used in practice...

Data Science Articles & Videos

Machine learning for fraud detection (at Stripe)
Using data from across the Stripe network, we’ve developed a machine learning system that evaluates charges in real-time and blocks those that are almost certainly fraudulent. By analyzing hundreds of different characteristics pertaining to each payment, these algorithms have already shielded businesses on Stripe from millions of attempted fraudulent charges...

Deep Image: Scaling up Image Recognition
We present a state-of-the-art image recognition system, Deep Image, developed using end-to-end deep learning. On one of the most challenging computer vision benchmarks, the ImageNet classification challenge, our system has achieved the best result to date, with a top-5 error rate of 5.98% - a relative 10.2% improvement over the previous best results...

Taste and Trust
A promise of social networking platforms is their ability to leverage your social network to provide useful recommendations. We are overwhelmed by the choices available to us, and historically we've relied on our social environment to help us navigate them. Yes, there’s a risk that personalizing our experiences based on our social networks will trap us in a filter bubble. But that’s not my biggest concern. I’m interested in the roles that taste and trust play in making recommendations useful...

Game Theorists Crack Poker
An 'essentially unbeatable' algorithm for the popular card game points to strategies for solving real-life problems without having complete information...

Facial Keypoints Detection
Our [Kaggle's] friend Daniel Nouri, founder of Natural Vision and top contender on the leaderboard, has written a tutorial blog post on this competition. The tutorial outlines how to use convolutional neural nets to detect facial key points on this competition's dataset...

Jobs

Lending Club is the world’s largest online marketplace connecting borrowers and investors... looking to hire a big data scientist. Data, and lots of it, is at the core of Lending Club’s business. We use our rapidly growing dataset to understand the market, make credit decisions, predict performance, optimize ROI, and define product strategy. We need a scientist who is excited about helping us build our next generation big data analytics platform that runs on Hadoop...

Training & Resources

Introducing practical and robust anomaly detection in a time series
Both last year and this year, we saw a spike in the number of photos uploaded to Twitter on Christmas Eve, Christmas and New Year’s Eve (in other words, an anomaly occurred in the corresponding time series). Today, we’re announcing AnomalyDetection, our open-source R package that automatically detects anomalies like these in big data in a practical and robust way...

ggplot2 1.0.0
As you might have noticed, ggplot2 recently turned 1.0.0. This release incorporated a handful of new features and bug fixes, but most importantly reflects that ggplot2 is now a mature plotting system and it will not change significantly in the future...