Spark Streaming at Bing Scale

Hundreds of millions of search queries hit Bing.com every day, generating a massive volume of logs and signals that must be collected, processed, and enriched in near real time to monitor quality of service, analyze user engagement, and act on revenue opportunities in a timely manner. We use Apache Spark Streaming to implement the data processing pipeline for this scenario and run it in production. In this talk, we will cover (1) the architecture of the stream processing pipeline and (2) key challenges and lessons learned in building streaming data pipelines with Spark and Kafka.
