Cameron Joannidis

Principal Machine Learning Engineer working at the nexus of machine learning, big data and functional programming at Simple Machines, an Australian consultancy specialising in end-to-end data systems, from engineering and DevOps through to machine learning and analytics.

Building a Centralised Machine Learning Pipeline with Spark and Kafka

7 months ago

Sold Out!

50 Mins

Talk

Intermediate

Many organisations face the challenge of getting machine learning projects to market quickly while enabling data science teams to share their features. In this talk, I will discuss the machine learning pipeline developed at a large Australian telecommunications company to achieve this goal using Kafka and Spark, as well as the challenges faced along the way. I'll begin with the utility of and motivation for a centralised feature store, before looking at the complexities of such an undertaking (both technical and organisational). We will then dig into the implementation details, discussing the scalability headaches we faced and the solutions used to drastically improve the speed and organisational scalability of the system. Areas covered include: a declarative API that allowed us to compile feature definitions into optimised Spark code; the complexity of a true streaming dedupe; adjusting the workflow for different machine learning use cases; fine-tuning resource allocation to avoid unnecessary bottlenecks; and supporting both streaming and batch data sources. Finally, we will touch on lessons learnt along the way, offer advice on pitfalls to avoid, and suggest how to take things to the next stage.
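The abstract mentions a declarative API that compiles feature definitions into optimised Spark code. As a hedged sketch of that general idea (not the system from the talk), the snippet below "compiles" declarative feature definitions into a single pass over a batch of records in plain Python; the names `FeatureDef` and `compile_features` are hypothetical, and a real implementation would emit a Spark plan instead.

```python
from dataclasses import dataclass
from collections import defaultdict
from typing import Callable

# Hypothetical declarative feature definition: what to aggregate,
# keyed by which entity, over which input field.
@dataclass(frozen=True)
class FeatureDef:
    name: str   # output feature name
    key: str    # grouping key, e.g. "customer_id"
    field: str  # input field to aggregate
    agg: str    # one of "sum", "count", "max"

_AGGS: dict[str, Callable] = {"sum": sum, "count": len, "max": max}

def compile_features(defs: list[FeatureDef]) -> Callable:
    """'Compile' declarative definitions into one function over a batch
    of records. A real system would generate an optimised Spark job so
    that features sharing a key are computed together."""
    def run(records: list[dict]) -> dict:
        out: dict = defaultdict(dict)
        for d in defs:
            groups = defaultdict(list)
            for r in records:
                groups[r[d.key]].append(r[d.field])
            for k, vals in groups.items():
                out[k][d.name] = _AGGS[d.agg](vals)
        return dict(out)
    return run

features = [
    FeatureDef("total_spend", "customer_id", "amount", "sum"),
    FeatureDef("num_orders", "customer_id", "amount", "count"),
]
run = compile_features(features)
rows = [
    {"customer_id": "a", "amount": 10},
    {"customer_id": "a", "amount": 5},
    {"customer_id": "b", "amount": 7},
]
result = run(rows)  # e.g. result["a"] == {"total_spend": 15, "num_orders": 2}
```

The point of the declarative layer is that feature authors describe *what* they want, while the compiler decides *how* to execute it efficiently.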

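The abstract also calls out the complexity of a true streaming dedupe. A minimal sketch of the core idea, watermark-bounded deduplication state, is shown below in plain Python; the `Deduper` name and API are hypothetical, and a real Spark or Kafka pipeline would keep this state in a checkpointed state store rather than an in-memory dict.

```python
from dataclasses import dataclass, field

@dataclass
class Deduper:
    watermark: int  # how long (in event time) to remember seen keys
    _seen: dict = field(default_factory=dict)  # event key -> event time

    def offer(self, key: str, event_time: int) -> bool:
        """Return True if the event is new and should be emitted.
        Keys older than the watermark are evicted, which bounds state
        size, but a duplicate arriving later than the watermark will
        slip through -- the fundamental trade-off of streaming dedupe.
        (This sketch also assumes roughly ordered event times.)"""
        # evict keys that have aged out of the watermark window
        self._seen = {k: t for k, t in self._seen.items()
                      if event_time - t <= self.watermark}
        if key in self._seen:
            return False
        self._seen[key] = event_time
        return True

d = Deduper(watermark=10)
d.offer("evt-1", 100)  # True: first sighting
d.offer("evt-1", 105)  # False: duplicate within the watermark
d.offer("evt-1", 120)  # True: state evicted, late duplicate slips through
```

The watermark is the knob that trades memory against correctness: a longer window catches more duplicates but retains more state.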
An Intuitive Guide to Combining Free Monad and Free Applicative

1 year ago

Sold Out!

30 Mins

Talk

Intermediate

The usage of Free Monads is becoming better understood, but the lesser-known Free Applicative is still somewhat of a mystery to the average functional programmer. In this talk I will explain how you can combine the power of both constructs in an intuitive and visual manner. You will learn the motivations for using free structures in the first place, how to build up a complex domain, how to introduce parallelism into that domain, and other practical tips for designing programs with these structures. You will also gain a deeper understanding of what libraries like Freestyle are doing under the hood and why this approach is so powerful.
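The key intuition in the abstract, that monadic binds force sequencing while applicative composition exposes independent effects that can be batched or parallelised, can be sketched without Scala. Below is a deliberately minimal, hypothetical encoding in Python (in Scala this would be cats' `Free` and `FreeApplicative`); the node names `Pure`, `Op`, `Par` and `Bind` are illustrative only.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Pure:   # a finished computation
    value: Any

@dataclass
class Op:     # a single effect in our DSL, e.g. "fetch this key"
    key: str

@dataclass
class Par:    # free-applicative node: independent ops, no data deps
    ops: list

@dataclass
class Bind:   # free-monad node: sequencing with a data dependency
    prog: Any
    cont: Callable[[Any], Any]

def interpret(prog, fetch):
    """Run a program. Because Par exposes all its ops at once, an
    interpreter may batch or parallelise them; Bind cannot be
    inspected ahead of time and forces sequential evaluation."""
    if isinstance(prog, Pure):
        return prog.value
    if isinstance(prog, Op):
        return fetch(prog.key)
    if isinstance(prog, Par):
        # all ops visible up front -> could be one batched/parallel call
        return [fetch(op.key) for op in prog.ops]
    if isinstance(prog, Bind):
        return interpret(prog.cont(interpret(prog.prog, fetch)), fetch)
    raise TypeError(prog)

store = {"a": 1, "b": 2, "c": 3}
# fetch "a" and "b" independently (applicative), then combine (monadic bind)
program = Bind(Par([Op("a"), Op("b")]),
               lambda ab: Pure(sum(ab)))
interpret(program, store.get)  # 3
```

The structural difference is the whole point: `Par` is data the interpreter can analyse before running anything, while `Bind` hides the rest of the program inside a function.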

Machine Learning Systems for Engineers

1 year ago

Sold Out!

30 Mins

Talk

Intermediate

Machine Learning is often discussed in the context of data science, but little attention is given to the complexities of engineering production-ready ML systems. This talk explores some of the important challenges and offers advice on how to address them.