Flexible, fast and friendly, Spark is a leading distributed processing framework for big data analytics at scale. This Apache Spark course will help you take advantage of the platform through its native Scala binding to write fast, data-centric applications. Learn how to implement...

Constructing real-time production systems is easier than ever, with new use cases enabled by new big data frameworks, and complexity and cost cut by the cloud. But big data processing comes with some problems. This two-day intensive course will teach you to build systems which can contend with...

Two days in London

Machine learning is having a dramatic impact on all industries and improving productivity at an exponential rate. Big data is transforming almost every aspect of science and humanities, driven by the emergence of a data society. Together, data science and machine learning are the driving forces...

Data processing is undergoing a paradigm shift as we move more and more towards real-time processing. Emerging software models such as the SMACK stack are at the forefront of this change, focused on a pipeline processing model, but are also introducing new levels of operational...

Two days in London

Join us at Scala eXchange London 2019, Europe's largest gathering of Scala engineers, to discover where Scala is headed in 2020 and to meet, learn and share skills with 1,000+ other passionate Scala developers.

In this talk, Matei Zaharia covers the basics of Spark, sketches some of its applications, and discusses new features under development, including SQL on Spark (Shark) and capabilities for stream processing.

A look at the internals of the Cassandra Spark Connector and its architecture. How Spark co-operates with Cassandra. Best practice in terms of data retrieval and manipulation and some tuning tips to ensure you don't trip up when getting started or scaling out.

For the 2016 Roskilde Festival, IBM was part of the Festival's Big Data team. Why did the Roskilde organisation want a Big Data team? What did we learn? How was Spark involved? This session explores an innovative use of Spark and some surprising conclusions.

The nature of internet data processing is changing. Sensors. IoT devices. Online ecommerce. Social networks. All these are delivering a continuous flow of information that has to be processed and acted upon quickly. At the same time these problems require solutions that are also scalable and...

There has been an explosion of interest in Apache Spark as a new, alternative computing paradigm for Hadoop. Spark provides a faster and more general data processing platform. Spark lets you run programs up to 100x faster in memory, or 10x faster on disk, than Hadoop MapReduce. It offers something to...