Data Management Layer for Machine Learning

HopsML Stockholm

Our second meetup features two talks. The first presents a world first from Stockholm: the first open-source Feature Store for Machine Learning, given by Kim Hammar. The second is by Stavroula Vassaki from Huawei, who organized a 24-hour hackathon on Deep Learning in December 2018 with over 20 participants.

Agenda:

17:45 Doors Open

18:00 - 18:40: Kim Hammar: Feature Store

18:40 - 19:00: Break with something light to eat and drink

19:00 - 19:40: Stavroula Vassaki: Deep Learning Hackathon

19:40 - 20:00: Networking

Title: Feature Store: A Data Management Layer for Machine Learning

Speaker: Kim Hammar

Kim is a software engineer at Logical Clocks AB, the main developer of Hops Hadoop (www.hops.io). He received his MSc in Distributed Systems from KTH in 2018. He has previously worked as an engineer at Ericsson, as a researcher at KTH Royal Institute of Technology, and as a data scientist at Allstate. He has also presented at the Web Intelligence Conference (WI '18).

Abstract:

Data may be the new oil, but refined data is the fuel for AI. Machine learning (ML) systems are only as good as the data they are trained on, and getting the data in the right format at the right time is a challenge. ML systems are trained using sets of features. A feature can be as simple as the value of a column in a database entry, or it can be a complex value computed from diverse sources.

A feature store is a central vault for storing documented and curated features, ideally with support for access control. A feature store enables automatic feature analysis and monitoring, feature sharing across models and teams, feature discovery, feature backfilling, and feature versioning. The feature store is a data management layer that fills an important gap in modern machine learning infrastructure. It empowers enterprises to scale their machine learning workflows and make full use of their investment in machine learning.
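
To make the idea concrete, here is a minimal in-memory sketch of three of the capabilities listed above: versioning, discovery, and sharing features across groups. It is an illustration under stated assumptions, not the HopsML or Hopsworks API; the class `ToyFeatureStore`, its methods, and the sample feature groups are all hypothetical names invented for this example.

```python
from typing import Dict, List, Optional, Tuple

Rows = Dict[str, Dict[str, float]]  # entity id -> {feature name: value}


class ToyFeatureStore:
    """Hypothetical in-memory feature store; not the Hopsworks API."""

    def __init__(self) -> None:
        # (group name, version) -> (description, rows)
        self._groups: Dict[Tuple[str, int], Tuple[str, Rows]] = {}

    def latest_version(self, name: str) -> int:
        versions = [v for (n, v) in self._groups if n == name]
        if not versions:
            raise KeyError(f"unknown feature group: {name}")
        return max(versions)

    def register(self, name: str, description: str, rows: Rows) -> int:
        """Store a documented feature group and return its new version."""
        try:
            version = self.latest_version(name) + 1
        except KeyError:
            version = 1  # first registration under this name
        self._groups[(name, version)] = (description, rows)
        return version

    def search(self, keyword: str) -> List[Tuple[str, int]]:
        """Feature discovery: match the keyword against names and descriptions."""
        kw = keyword.lower()
        return sorted(
            key for key, (desc, _) in self._groups.items()
            if kw in key[0].lower() or kw in desc.lower()
        )

    def get(self, name: str, version: Optional[int] = None) -> Rows:
        """Return the rows of one version (the latest if none is given)."""
        if version is None:
            version = self.latest_version(name)
        return self._groups[(name, version)][1]

    def training_rows(self, features: List[Tuple[str, str]]) -> Dict[str, List[float]]:
        """Join (group, feature) pairs on entity id, keeping ids present in every group."""
        tables = [self.get(group) for group, _ in features]
        shared_ids = set(tables[0]).intersection(*tables[1:])
        return {
            entity: [table[entity][feat] for table, (_, feat) in zip(tables, features)]
            for entity in sorted(shared_ids)
        }


store = ToyFeatureStore()
store.register("customer_activity", "Purchases per customer, last 7 days",
               {"c1": {"purchases_7d": 3.0}, "c2": {"purchases_7d": 1.0}})
store.register("customer_profile", "Static customer attributes",
               {"c1": {"age": 41.0}, "c2": {"age": 29.0}, "c3": {"age": 35.0}})
# Re-registering under the same name creates version 2; version 1 stays readable.
v2 = store.register("customer_activity", "Purchases per customer, last 7 days (recomputed)",
                    {"c1": {"purchases_7d": 4.0}, "c2": {"purchases_7d": 1.0}})

# Two teams' features combined into one training set, joined on customer id.
train = store.training_rows([("customer_activity", "purchases_7d"),
                             ("customer_profile", "age")])
print(train)
```

A production feature store would add persistent storage, access control, statistics, and backfilling; the sketch only shows why a shared, versioned registry lets several models reuse the same curated features.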

In this talk, we will present key points on how to take your machine learning workflow to the next level using a feature store, and demonstrate how the feature store fits into the larger machine learning pipeline. We will introduce HopsML, an open-source, end-to-end machine learning pipeline built on the world's fastest and most scalable Hadoop distribution, Hops Hadoop.

Title: Huawei Deep Learning Experience – 24h Hackathon

Speaker: Stavroula Vassaki, Huawei

Stavroula is an experienced R&D engineer with a demonstrated history of working in the research industry. She is skilled in Telecommunications (Radio Resource Management), Optimization, and Machine Learning. She holds a PhD in Electrical and Computer Engineering from the School of Electrical and Computer Engineering, National Technical University of Athens.

Abstract:

On December 11th and 12th, 2018, Huawei organized a 24-hour hackathon in collaboration with KTH. Forty-three talented KTH Master's students participated in the Deep Learning Challenge to compete for a trip to China and internship opportunities with Huawei. During the 24 hours, the students faced a semi-supervised learning challenge. More specifically, their goal was to analyze and exploit more than 100k unlabeled samples (images) provided by Huawei and build a mechanism able to accurately recognize the correct class using as few labeled samples as possible. For the supervised learning task, the students were given a small dataset of 5k labeled samples (4,000 for training and 1,000 for testing), while they were evaluated on a completely unknown dataset of similar structure that was revealed at the end of the challenge. To implement their solutions, the students could write and run their code in Python/PySpark/TensorFlow-Keras/PyTorch, using the Hopsworks platform provided by Logical Clocks. The different solutions that were implemented, as well as the challenges and benefits of running a deep learning hackathon, will be discussed at the meetup.