Alex Dean will talk about building Snowplow, an open source event analytics platform, on top of Scala and key libraries and frameworks including Scalding, Scalaz and Spray. He will highlight some of the data processing tricks and techniques picked up in the process, particularly schema-first development, monadic ETL, datatable-based testing and data transformation maps. He will also introduce some of the Scala libraries the Snowplow team have open sourced along the way (such as scala-forex, referer-parser and scala-maxmind-geoip).

Building data processing applications in Scala: the Snowplow experience

I'm the co-founder and tech lead at Snowplow Analytics, the open source web and event analytics platform (https://github.com/snowplow/snowplow). Snowplow is written almost exclusively in Scala, using a range of technologies including Scalaz, Scalding and Spray. I spend a lot of time working with distributed systems (historically Hadoop, increasingly Kinesis, Kafka and others) to deliver scalable event stream processing. I'm also the author of Unified Log Processing, published by Manning Publications (http://manning.com/dean/).
