Workshops

Machine Learning and predictive analytics for the Enterprise

SIMONE SCARDAPANE

The workshop will introduce predictive analytics and machine learning tools in Python. We will show how to implement a complete ML workflow, including data preprocessing, model selection, and evaluation. Appropriate libraries for handling big data situations (e.g., Spark) will also be discussed.

DURATION
The workshop lasts a full day (8 hours), from 9:00 to 18:00, with a one-hour lunch break.

CHECK IN: 8:30 – 9:00

PRICES:
The ticket price for each 8-hour workshop is set as follows:
– 130 € from the start of sales until the 26th of January;
– 160 € from the 27th of January until the 23rd of February;
– 190 € from the 24th of February until the 14th of March;
– 220 € from the 15th of March until the end of sales.

SIMONE SCARDAPANE
Simone Scardapane is a post-doc fellow at Sapienza University (Rome) and an honorary fellow at the University of Stirling (UK). His research is focused on machine learning, with an emphasis on deep learning, distributed environments, and applications in the audio field. Before his PhD, he obtained a B.Sc. in Computer Engineering in 2009 and an M.Sc. in Artificial Intelligence and Robotics in 2011. He is an active member of several organizations, including the IEEE Computational Intelligence Society, the International Neural Network Society, and the AI*IA.

ABSTRACT
The workshop will introduce predictive analytics and machine learning tools in Python. Using two realistic use cases, we will discuss all the steps for implementing a complete ML workflow in an enterprise setting, both theoretically and practically. These include data preprocessing, model selection, evaluation, and lifelong learning. All topics will be shown in practice by introducing several Python libraries, including pandas (for data loading), seaborn (for visualization), and scikit-learn (for ML). Additionally, we will introduce big data tools such as Spark for the design of predictive models.

TABLE OF CONTENTS
1- The workshop will begin by introducing NumPy, a fundamental library for handling large numerical arrays in Python, which is used extensively in all other libraries.
2- Next, we will introduce some basic scientific libraries for plotting (matplotlib) and for performing basic statistical analysis (SciPy).
3- A complete workflow for classification using scikit-learn will be discussed.
4- We will discuss additional functionalities of scikit-learn, including clustering, data visualization, and pipelining.
5- We will show how to extend the ML workflow to a Spark environment with the MLlib library.
6- Additional topics will be introduced, including the handling of time-series data for prediction, advanced visualization using seaborn, processing of complex data using pandas, and how to port ML models to a mobile environment.
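
As a rough illustration of the classification workflow in point 3 (preprocessing, model selection, and evaluation), here is a minimal sketch using scikit-learn. It is not taken from the workshop material: the Iris dataset, the logistic regression model, the grid of values for C, and the function name run_workflow are all assumptions chosen only to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def run_workflow(random_state=0):
    """Hypothetical helper: preprocess, select a model, and evaluate it."""
    # Load a small built-in dataset (an assumption; any tabular dataset works).
    X, y = load_iris(return_X_y=True)

    # Hold out a test set that is never used during model selection.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )

    # Preprocessing and classifier are chained so that scaling is
    # fitted only on the training folds during cross-validation.
    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ])

    # Model selection: 5-fold cross-validation over the regularization strength.
    search = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=5)
    search.fit(X_train, y_train)

    # Evaluation: accuracy of the best model on the held-out test set.
    return search.best_params_, search.score(X_test, y_test)


if __name__ == "__main__":
    best_params, test_accuracy = run_workflow()
    print("Best parameters:", best_params)
    print("Test accuracy:", test_accuracy)
```

Wrapping the scaler and the classifier in a single Pipeline is one common way to avoid leaking information from the validation folds into the preprocessing step.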

TRAINING OBJECTIVES
At the end of the workshop, participants will be able to design relatively standard machine learning models in their applications for different tasks such as automatic prediction, clustering, and high-dimensional data visualization. They will understand the Python ecosystem of libraries, including the use of distributed frameworks such as Spark. Fundamental theoretical concepts will also be introduced during the workshop.

WHO IS THE WORKSHOP FOR?
The workshop is open to everyone interested in understanding the fundamentals of ML and predictive analytics for the enterprise, with an emphasis on Python libraries. The workshop will introduce both theoretical and practical concepts, together with their concrete implementation.

PREREQUISITES NEEDED FROM ATTENDEES
No prior exposure to machine learning or to numerical computation in Python is required, as all the necessary concepts will be introduced as needed. A basic knowledge of linear algebra and/or optimization is enough to follow the entire workshop.

HARDWARE AND SOFTWARE REQUIREMENTS
Participants need to bring their own laptop.
A standard laptop is enough to replicate all code provided in the workshop. In order to have a working installation of Python with all the required tools, participants can install the Anaconda distribution (https://www.continuum.io/downloads) or a similar full-stack solution. In order to run Spark, participants can download the latest release in advance from the website (https://spark.apache.org/), and will be shown how to run it in a standalone fashion during the workshop.
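
To verify the Python installation before the workshop, a short check along the following lines may help. This is only a sketch: the list of module names is an assumption based on the libraries mentioned in this announcement, and check_environment is a hypothetical helper, not part of any of these libraries. Spark is left out because a release downloaded from the website is not necessarily importable from Python.

```python
import importlib

# Import names of the libraries mentioned in the announcement (an assumption;
# note that scikit-learn is imported as "sklearn").
REQUIRED = ("numpy", "scipy", "matplotlib", "pandas", "seaborn", "sklearn")


def check_environment(modules=REQUIRED):
    """Return a dict mapping each module name to its version, or None if missing."""
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            # Not every module exposes __version__, so fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in check_environment().items():
        status = version if version is not None else "MISSING"
        print(f"{name}: {status}")
```

Any library reported as MISSING can usually be added through the package manager of the chosen distribution.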

WARNING: Seats are limited. The workshop will be held only if the minimum number of attendees is reached.