Apache Spark is well known as a powerful platform for the iterative computations that ML requires. This talk shows how to combine the strengths of Spark’s ML library (MLlib) with popular packages such as scikit-learn and TensorFlow. Scikit-learn is the de facto standard ML library for Python, and TensorFlow is a deep learning library recently open-sourced by Google.

We also discuss the improvements to MLlib in Spark 2.0 and the future of MLlib’s APIs. The roadmap includes more algorithms and features for users, as well as more utilities and abstractions to aid developers.

Tim Hunter is a software engineer at Databricks and a contributor to the Spark MLlib project. He has been building distributed machine learning systems with Spark since version 0.5, before Spark became an Apache Software Foundation project.