Lambda Architecture in the Cloud with Azure Databricks

The term “Lambda Architecture” refers to a generic, scalable and fault-tolerant data processing architecture. As hyperscale cloud providers now offer a variety of PaaS services for data ingestion, storage and processing, the need for a revised, cloud-native implementation of the Lambda Architecture is growing.

In this talk we demonstrate a blueprint for such an implementation in Microsoft Azure, with Azure Databricks, a PaaS Spark offering, as a key component. We go back to some core principles of functional programming and link them to the capabilities of Apache Spark for various end-to-end big data analytics scenarios.

We also illustrate the “Lambda Architecture in use” and the associated trade-offs using a real customer scenario: the Rijksmuseum in Amsterdam, where a terabyte-scale Azure-based data platform handles data from 2.5 million visitors per year.

Andrei Varanovich leads the Data & AI team at InSpark (The Netherlands), where his primary focus is on building cloud-first data solutions on Azure. He is passionate about technology, people and professional communities. He has earned the Microsoft Most Valuable Professional award every year since 2009 and holds a PhD in Computer Science from the University of Koblenz-Landau, Germany.

Apache, Apache Spark, Spark, and the Spark logo are trademarks of the Apache Software Foundation.
The Apache Software Foundation has no affiliation with and does not endorse the materials provided at this event.