This event follows the NumFOCUS code of conduct; please familiarise yourself with it before the event. Please get in touch with the organisers if you have any questions or concerns about the code of conduct.

We are issuing tickets via a lottery. If you want to be in with a chance of a place, sign up for the waitlist! The lottery will be run approximately one week before the meetup. Closer to the event, we will either re-run the lottery or draw from the waitlist to fill any places that free up.

Main Talks
--------------

Franz Kiraly - Supervised learning with skpro

A hands-on session on building and assessing predictive models that also produce uncertainty estimates for their predictions, such as predictive intervals and fully probabilistic supervised predictions, including Bayesian predictive posteriors. The session starts with a gentle theoretical introduction to what probabilistic classification/regression models are, how to make them, and how to evaluate them, followed by a demo of the skpro package, "scikit-learn for probabilistic supervised learning".

Bring a laptop with Python >= 3.6 and Jupyter installed if you'd like to follow along.

The session will also feature a short overview of the Python modelling toolboxes currently in development at the Alan Turing Institute, and how to get involved.

Machine Learning (ML) pipelines are the fundamental building block for productionizing ML code. However, existing Python tutorials and educational material for Data Scientists emphasize ad-hoc feature engineering and training pipelines for experimenting with ML models. Such pipelines tend to become complex over time and do not allow features to be easily re-used across different ML pipelines. Features used for training and for serving may also have separate implementations that are inconsistent with each other.

In this talk, we will show how ML pipelines can be programmed, end-to-end, in Python. We will show how a Feature Store can provide a natural interface between Data Engineers, who create reusable features from diverse data sources, and Data Scientists, who experiment with predictive models built from those same features. Using an example end-to-end pipeline, we will show how Python reduces the impedance mismatch between Data Engineering and Data Science, enabling Python to become the only language needed for ML pipelines.

Logistics
-------------

Doors open at 6.30 pm (get there early, as you have to sign in via AHL's security), talks start at 7 pm, and drinks are from 9 pm in the bar. We normally have more than 200 folks in the room, so there are plenty of people to discuss data science questions with!

Please unRSVP in good time if you realize you can't make it. We're limited by building security on the number of attendees, so please free up your place for your fellow community members!