Join us for the 2nd Big Data Application Meetup

August 17, 2015

Henry Saputra is a Software Engineer at Cask, as well as a committer and Project Management Committee (PMC) member of several Apache Software Foundation (ASF) projects such as Flink, Gora, and Twill (incubating). Henry is also an active member of the Apache Incubator Project Management Committee (IPMC) and a member of the Apache Software Foundation.

By sponsoring knowledge-sharing and community-building through the Big Data Application Meetup, Cask continues to take the lead in promoting the technologies and best practices used to build big data applications.

For the second meetup, we have three exciting speakers with very interesting topics.

First, we have a talk about the use of Apache Kylin (incubating) at eBay. Apache Kylin is an open-source distributed analytics engine, contributed by eBay, that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop with support for extremely large data sets. In this talk, Seshu Adunuthula and Branky Shao from eBay will introduce “Cube Segments”, the ability to build cubes on micro-batches of data consumed from Kafka topics.

The next talk, by Jia-long Wu from Lotame, will be about High Volume Streaming Analytics with CDAP (Cask Data Application Platform). Jia-long will talk about how CDAP helps Lotame solve specific challenges in counting uniques in a high-volume stream processing environment, and will present a novel approach to using time-windowed “HyperLogLog” aggregates. The talk will also discuss how CDAP enabled Lotame to roll out this new platform quickly, and cover valuable lessons and best practices established during the development cycle.

And finally, we have a talk introducing Athena, a stream processing platform for Uber’s near real-time analytics applications, presented by Yuanchi Ning from Uber’s Data Engineering team. The talk will cover the tooling built around Samza for easier user onboarding, integration with the Typesafe Config system, a unit test framework, Graphite integration, metric whitelisting, and much more.

Cask will provide food and drinks. Doors open at 6pm, so come socialize and talk about how awesome building applications for big data is.