Cross-cutting Areas and Systems

CyberInfrastructure (Cloud, Computation, Networks)

Cyberinfrastructure (CI), both human and machine, is where the ideas and fundamentals of data science are implemented in practical systems that scientific communities use to derive knowledge from data. This is an integration-heavy area that could benefit from, and relate to, many others.

Telecommunication Networks and Wireless Systems

Traditional approaches to traffic engineering and network deployment rely on generic modelling assumptions and rule-of-thumb overprovisioning. Future-generation systems, such as 5G systems, aspire to network a vastly larger variety of devices in support of highly diverse applications. The design and operation of these expensive, complex, interconnected systems will be increasingly data-driven and can benefit from advances in machine learning algorithms. Our goals in this regard include (1) creation of datasets that capture city-scale data traffic and mobility patterns and (2) algorithms to infer numerous measures of value to network designers and operators, as well as to multiple disciplines, including public health, mental health, environment, transportation, and energy usage.

Data-Driven System Design

We focus on the use of data science to reduce the difficulty and cost of complex system design processes that today require thousands of engineers and years of schedule time. Goals include (1) modeling and prediction of design tool behaviors and outcomes, (2) discovery of appropriate optimization objectives to be applied at given stages of a design process, and (3) enabling accurate model-guided exploration of extremely large solution spaces, both for designs and for design processes.

Data Visualization, Sonification, Interactive Spaces (AR/VR)

Activity Centered Visualization

Extracting salient representations from data is important for a variety of scientific, practical, and entertainment applications. Such representations may help humans make sense of complex information, and also allow machines to operate on such data in ways that are meaningful for humans. Equally, sonification may offer new insights into time-based data by relying on the human ability to detect patterns in audio. The tools to be developed in this cluster will offer new ways of interacting with machines in both real and virtual settings, and will allow humans and machines to coexist and cooperate in joint augmented reality spaces.
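As a rough illustration of sonification of time-based data, the sketch below maps each value of a series linearly onto a pitch and renders a short sine tone per value. It is only one possible mapping, not a method described in this document; the function names `sonify` and `tone`, the 220 to 880 Hz range, and the sample readings are all assumptions made for the example.

```python
import math

def sonify(values, low_hz=220.0, high_hz=880.0):
    """Map each data value linearly onto a pitch between low_hz and high_hz."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant series maps to the midpoint pitch.
        return [(low_hz + high_hz) / 2.0 for _ in values]
    return [low_hz + (v - lo) / span * (high_hz - low_hz) for v in values]

def tone(freq_hz, duration_s=0.1, sample_rate=8000):
    """Return sine-wave samples in [-1, 1] for one tone."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

# Hypothetical readings: a rise followed by a fall.
readings = [3.0, 4.0, 5.0, 4.0, 3.0]
pitches = sonify(readings)
samples = [s for f in pitches for s in tone(f)]
```

Played back in order, the tones rise and then fall in pitch, so the shape of the series becomes audible. The samples could be written to a WAV file with Python's standard `wave` module.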

Scientific Workflows and Process Management

Instruments (and possibly simulations as well) across all scientific disciplines are a source of petabyte- to exabyte-scale data collections that need to be reconstructed, calibrated, validated, transferred, curated, made available, and ultimately analyzed by hundreds to thousands of scientists worldwide. Myriad scientific workflow questions remain to be solved across the data life cycle, from the bytes coming out of instruments to the scientific knowledge flowing from the collected data. Many of the fundamental challenges are common across disciplines, so ideas, solutions, and perhaps even technologies may be shared. Some of these shared technologies may themselves involve data analytics to design, automate, and optimize workflows and processes.

Data Science in Art

Understanding how meaning is made with new methods of representation is the ongoing undertaking of contemporary art. With new ways of characterizing the self and the world through data science, insight will come from the creation of new experiential and syntactical modes.

Neuromorphic Engineering

Neuromorphic Silicon Learning Machines

Learning and adaptation are key to natural and artificial intelligence in complex and variable environments. Neural computation and communication in the brain are partitioned into the grey matter, with dense local synaptic connectivity in tightly knit neuronal networks, and the white matter, with sparse long-range connectivity over axonal fiber bundles across distant brain regions. This exquisite distributed multiscale organization inspires the design of scalable neuromorphic systems for deep learning and inference, with hierarchical address-event routing of neural spike events, multiscale synaptic connectivity and plasticity, and efficient implementation in low-power mixed-signal very-large-scale integration (VLSI) circuits. Advances in machine learning and system-on-chip VLSI have led to the development of massively parallel silicon learning machines with pervasive real-time adaptive intelligence that begin to approach the efficacy and resilience of biological neural systems, and already exceed the nominal energy efficiency of synaptic transmission in the mammalian brain.
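To make the idea of hierarchical address-event routing more concrete, the sketch below treats each spike as a (timestamp, address) event and splits the address into a core field and a neuron field. Events addressed to the local core are delivered directly, loosely analogous to dense local connectivity, while the rest are queued per destination core, loosely analogous to sparse long-range connectivity. This is a simplified software illustration under assumed conventions, not the routing scheme of any particular chip; the 8-bit neuron field, the `SpikeEvent` type, and the `split_address` and `route` functions are hypothetical.

```python
from collections import namedtuple

# Hypothetical layout: the low 8 bits of an address select a neuron within a
# core, and the remaining high bits select the core.
NEURON_BITS = 8
NEURON_MASK = (1 << NEURON_BITS) - 1

SpikeEvent = namedtuple("SpikeEvent", ["time_us", "address"])

def split_address(address):
    """Return (core_id, neuron_id) for a flat neuron address."""
    return address >> NEURON_BITS, address & NEURON_MASK

def route(events, local_core):
    """Partition events into local deliveries and per-core outgoing queues."""
    local, outgoing = [], {}
    for event in events:
        core_id, neuron_id = split_address(event.address)
        if core_id == local_core:
            # Local delivery needs only the neuron index within this core.
            local.append((event.time_us, neuron_id))
        else:
            # Remote events keep their full address for the next routing level.
            outgoing.setdefault(core_id, []).append(event)
    return local, outgoing

# Hypothetical spike events: two target core 1, one targets core 2.
events = [SpikeEvent(10, 0x0105), SpikeEvent(12, 0x0203), SpikeEvent(15, 0x01FF)]
local, outgoing = route(events, local_core=1)
```

Adding further address fields (for example, a chip field above the core field) would extend the same split-and-forward step to more levels of the hierarchy.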