Prerequisite knowledge

Basic knowledge of Jupyter and JupyterHub

What you'll learn

Understand the challenges of interactive data analysis in high energy physics, in particular how to keep an analysis interactive while offloading computations to external resources

Learn how Jupyter was integrated into CERN's ecosystem of technologies for mass storage, sync and share, and software distribution

Description

Both CERN and high energy physics (HEP) in general face unprecedented challenges in data storage, processing, and analysis. The experiments of the Large Hadron Collider (LHC) are expected to reach one exabyte of physics data this year. Once this data has been processed and filtered, interactivity takes on particular importance in the last phases of analysis, where the final results are produced, typically in the form of plots.

Jupyter’s ability to provide notebooks that merge code, text, and other media into a rich narrative allows CERN to offer a web-based service that addresses the needs of the community. This service, called SWAN (an acronym for service for web-based analysis), provides the HEP community with an interactive interface to data analysis tools such as the ROOT framework. Moreover, SWAN integrates with CERN’s infrastructure, more precisely with users’ synchronized storage (CERNBox), computing resources, and the experiments’ data and software.

Diogo Castro offers an overview of SWAN and explains how the service is being used by researchers and students, both inside and outside CERN. Diogo also discusses the evolution of the service, especially the new SWAN interface, developed on top of Jupyter, which enables both easy sharing among users and connecting to Spark clusters.

Diogo Castro

CERN

Diogo Castro is a full-stack developer on the SWAN team within the Software Development for Experiments Group at CERN.