Prerequisite knowledge

Attendees should be familiar with distributed computing and architecture, basic analytics, NoSQL, and fault tolerance, and should have a general understanding of Apache Spark, Cassandra, and Kafka.

Description

With big and fast data growing explosively, developers who want both streaming analytics and ad hoc, OLAP-like analysis have often had to build complex architectures such as Lambda: a fast path for streaming analytics using NoSQL stores such as Cassandra and HBase, plus a separate batch path involving HDFS and Parquet. While this approach works, it involves too many moving parts, too many technologies for ops, and too many engineering hours. Helena Edelson and Evan Chan highlight a much simpler approach that combines streaming and ad hoc/batch analysis using what they call the NoLambda stack (Apache Spark/Scala, Mesos, Akka, Cassandra, Kafka) plus FiloDB, a new entrant to the distributed-database world that serves both streaming and ad hoc analytics.

Helena Edelson

Apple

Helena Edelson is a committer to several open source projects, including the Spark Cassandra Connector and Cassandra Kafka Connector, and a previous contributor to Akka (two new features in Akka Cluster), Spring Integration, and several others. She is also a speaker at international big data and Scala conferences, including Kafka Summit, Spark Summit (EU and NYC), Strata (NYC and San Jose), Reactive Summit, QCon SF, Scala Days (EU and US), Scala World, and Philly Emerging Technology. She is currently a senior software engineer in distributed systems at Apple.

Evan Chan

Tuplejump

Evan Chan is a distinguished software engineer at Tuplejump. Evan loves to design, build, and improve bleeding-edge distributed data and backend systems using the latest open source technologies. He has led the design and implementation of multiple big data platforms based on Storm, Spark, Kafka, Cassandra, and Scala/Akka, including a columnar real-time distributed query engine. Evan is an active contributor to the Apache Spark project, a DataStax Cassandra MVP, and cocreator and maintainer of the open source Spark Job Server. He is a big believer in GitHub, open source, and meetups and has given talks at various conferences, including Spark Summit, Cassandra Summit, FOSS4G, and Scala Days.

Apache Hadoop, Hadoop, Apache Spark, Spark, and Apache are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries, and are used with permission. The Apache Software Foundation has no affiliation with and does not endorse or review the materials provided at this event, which is managed by O'Reilly Media and/or Cloudera.