VEGAS: The Missing Matplotlib for Scala/Apache Spark

In this talk, we’ll present techniques for visualizing large-scale machine learning systems in Spark. Netflix uses these techniques to understand and refine the machine learning models behind its recommender systems, which personalize the Netflix experience for its 99 million members around the world. Essential to these techniques is Vegas, a new open-source Scala library that aims to be the “missing Matplotlib” for Spark/Scala. We’ll talk about the design of Vegas and its use in Scala notebooks to visualize machine learning models.
Session hashtag: #EUds0

Roger works as a Senior Research Engineer at Netflix, where he uses large-scale machine learning algorithms to improve movie recommendations for Netflix's 100 million subscribers. Prior to Netflix, he applied machine learning and information retrieval algorithms to improve the user experience at Yahoo! and Microsoft Bing. Beyond ML, he is very interested in distributed computing, having had a brief stint at Amazon Web Services (AWS).

DB Tsai is an Apache Spark PMC member and committer, and an open source and big data engineer at Apple Siri. He implemented several algorithms in Apache Spark, including linear models with Elastic-Net (L1/L2) regularization using the L-BFGS/OWL-QN optimizers. Prior to joining Apple, DB worked on personalized recommendation ML algorithms at Netflix. DB was a Ph.D. candidate in Applied Physics at Stanford University and holds a Master’s degree in Electrical Engineering from Stanford.

Apache, Apache Spark, Spark, and the Spark logo are trademarks of the Apache Software Foundation.
The Apache Software Foundation has no affiliation with and does not endorse the materials provided at this event.