Tools and Infrastructure for Reproducibility in Data-intensive Applications

Meaningful advances in science and engineering are increasingly predicated on data-driven
decision making. For these decisions to be valid, it is essential that one not only record the
process by which the data was produced, but also be able to reproduce the data at every
step in the process. While we are all used to tracking source code revisions and keeping track
of program inputs and outputs, the increased complexity of end-to-end computing pipelines,
coupled with new big-data and machine learning algorithms, makes it significantly harder to
track all of the steps and associated data that went into producing a result. For example,
one must keep track of exactly what data was used for a training set vs. an evaluation set, what
cleaning was done, what analysis was done on the results to evaluate performance, and what
additional experiments were performed. With the ever-increasing number, size, and complexity
of the data sets used in data-intensive applications, reproducing results from these types of
investigations becomes increasingly difficult. While no one deliberately sets out to create
irreproducible results, recent surveys of the literature show that the ability to reproduce
data-intensive results is the exception and not the rule.
For these reasons, a symposium on issues, tools and infrastructure for reproducibility in
data-intensive applications is highly germane to the ParCo community.
In this symposium, we propose to review the current state of the art in reproducibility in
data-intensive computing applications. We will cover three primary topic areas:

Reproducibility challenges that are specific to science and engineering activities that have
data-intensive computing as a core aspect of the process.

Infrastructure, tools and methods that are currently available for reproducible
data-intensive applications, and the gaps and challenges that need to be addressed.

How to increase the adoption of methods for reproducible data-intensive applications
across the research community.
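
As a concrete illustration of the bookkeeping described above (which data went into the training set vs. the evaluation set, and what cleaning was done), the following is a minimal sketch of a provenance manifest. It is not a tool presented at the symposium; the function names `build_manifest` and `verify`, the in-memory sample data, and the step descriptions are all hypothetical and chosen only for illustration.

```python
import hashlib
import json


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest identifying one exact version of a data set."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(datasets: dict, steps: list) -> dict:
    """Record one digest per named data set plus the ordered processing steps."""
    return {
        "datasets": {name: sha256_of(blob) for name, blob in sorted(datasets.items())},
        "steps": list(steps),
    }


def verify(manifest: dict, datasets: dict) -> list:
    """Return names of data sets that are missing or no longer match the manifest."""
    return [
        name
        for name, digest in manifest["datasets"].items()
        if name not in datasets or sha256_of(datasets[name]) != digest
    ]


# Hypothetical in-memory data standing in for real training and evaluation files.
data = {
    "train": b"x,y\n1,2\n3,4\n",
    "eval": b"x,y\n5,6\n",
}
manifest = build_manifest(data, ["drop rows with missing values", "normalize column x"])
print(json.dumps(manifest, indent=2))
```

Storing such a manifest next to each result makes it possible to check later, with `verify`, whether the data at hand is still the data that produced the result. A real pipeline would also need to record code revisions, parameters and the software environment.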

ParaFPGA2019: Parallel Computing with FPGAs

ParaFPGA focuses on parallel techniques for using FPGAs as accelerators in high performance
computing areas such as supercomputing, embedded systems and big data computing. Of special interest
are design methods, heterogeneous architectures and algorithms optimized for execution on FPGAs.
Design methods cover the optimization of resource utilization and development time, as well as
high-level synthesis tools.

Heterogeneous architectures include multi-FPGA systems, FPGAs with CPU cores, and systems combining FPGAs,
GPUs and CPUs. Novel algorithms optimized for FPGAs target areas such as streaming applications or fast
dynamic reconfiguration, featuring a substantial performance increase. New and original contributions
are invited on parallel computing with FPGAs in all areas of design methodology, performance analysis,
architectures, algorithms and applications. The CFP and submission details are available at
http://parafpga.elis.ugent.be.

The power requirements of large HPC facilities are becoming unsustainable for both technical and economic
reasons. A significant fraction of the total cost of ownership of HPC installations available nowadays is
already driven by the electricity bill, and the idea of charging users for the energy consumed by their
applications is spreading.

This calls for disruptive, smart and effective solutions, at the level of both hardware and software, to
maximize the energy efficiency and computing throughput of HPC systems within a given power envelope.

Several recently developed processors, such as multi- and many-core CPUs, GPUs, FPGAs and SoCs (System on Chip),
are engineered with features that improve computing energy efficiency, e.g. a higher level of parallelism,
dynamic clock frequency adaptation, and the capability of switching off idle cores. However, programming
complexity and code portability can be affected by these choices, and computing efficiency (as well as
energy efficiency) can suffer if applications are not properly coded to exploit (or take into account)
all of the hardware features. In the worst case, this may increase both the time-to-solution and the
energy-to-solution. In addition, the heterogeneity of modern compute nodes and systems, combined with the
dynamic behavior of modern applications, requires educated resource allocation and resource management
techniques to achieve energy efficiency.
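
To make the relation between time-to-solution and energy-to-solution concrete, the sketch below uses the simple model that energy-to-solution equals average power multiplied by time-to-solution. The `Run` class and the power and time figures are hypothetical, not measurements; they only illustrate how a lower-power setting can still cost more energy if the application runs long enough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One hypothetical execution of the same workload at a given hardware setting."""
    label: str
    avg_power_watts: float
    time_to_solution_s: float

    @property
    def energy_to_solution_j(self) -> float:
        # Energy (joules) = average power (watts) * elapsed time (seconds).
        return self.avg_power_watts * self.time_to_solution_s


# Assumed figures: lowering the clock reduces power but lengthens the run.
runs = [
    Run("nominal clock", avg_power_watts=200.0, time_to_solution_s=100.0),
    Run("reduced clock", avg_power_watts=150.0, time_to_solution_s=150.0),
]

for run in runs:
    print(f"{run.label}: {run.energy_to_solution_j:.0f} J")

# Pick the setting with the lowest energy-to-solution.
best = min(runs, key=lambda r: r.energy_to_solution_j)
print("lowest energy-to-solution:", best.label)
```

With these assumed numbers the reduced clock draws less power yet needs more energy overall, which is why both metrics have to be monitored together.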

This workshop aims to encourage the exchange of experiences and knowledge on novel strategies to
exploit, monitor, analyze, and optimize the energy efficiency of recent computing systems. We focus on new
trends including hardware and software tools, scheduling and resource management techniques, and
algorithm design, as well as techniques in general that are able to minimize the energy-to-solution of
workloads and to reduce the energy required to operate computing systems.

The Symposium is devoted to the ELPA project - an important and successful parallel eigensolver for dense symmetric (Hermitian) generalized eigenvalue problems.
The proposed talks will cover mathematical, algorithmic, implementation, and usage-oriented aspects.

First, the mathematical approach is presented for setting up algorithms for the eigenproblem that allow efficient solution in a parallel environment, especially the
two-step approach for transforming the matrix to tridiagonal form. Furthermore, implementation details, including the special case of banded matrices in parallel, are presented.
Then we will deal with improving the usability of the software library, especially monitoring and autotuning tools, and with optimizing the software for different parallel architectures.
Finally, the application of the solver in Computational Chemistry is presented.

Call for Symposia

AIMS AND SCOPE

ParCo2019 will cover the state of the art in the development, applications, and future trends in parallel
computing for both High Performance Computing (HPC) and Data Intensive Computing (DIC). Its scope encompasses all platforms, from Internet of
Things (IoT) and Robotics to HPC systems, Clouds, Quantum, and Neuro-Computing.

The conference addresses all aspects of parallel computing, including applications, hardware and software
technologies, and languages and development environments.

TOPIC AREAS

Section 1: Architectures

New concepts for parallel computing architectures for all levels of parallelism, including:

Multicore and manycore systems

Heterogeneous systems

Accelerators, including GPUs and FPGAs

High-performance systems, including peta- and exascale

Architectures for handling large data sets and data intensive computing, including high speed storage systems

Symposium proposals should be submitted by 31 March 2019, but later
proposals will be considered. For any questions about the organisation of a symposium please contact the conference office.

Selection/Acceptance

The symposium organiser(s) is/are responsible for the selection of papers to be presented. The submission date as well as the
format of papers required are determined by the organisers.

The titles, authors and short abstracts of all accepted presentations
must be made available to the Conference Office by 31 July 2019. Note that in the case of late submissions the relevant data
should be made available as soon as possible before the start of the conference.

Presentation of Papers

A session for the presentation of the papers will be reserved in the conference program by the Program Committee.
A projector for showing slides from a notebook or mobile device will be available. Organisers should communicate any
special wishes to the Conference Office.

Publication of Symposium Papers

Papers presented in a mini-symposium may be included in the conference proceedings. These are published after the conference.
Papers will be grouped together under the title of the Mini-Symposium, allowing for an Introduction by the organiser(s).
The choice of publication rests with the organiser(s) of a mini-symposium. Papers are allowed the same number of pages as all other
papers included in the proceedings, i.e. 10 pages. Additional pages may be purchased by contacting the Conference Office.

The organiser(s) is/are responsible for the refereeing and acceptance of final versions of papers to be included in the proceedings.
The final submission date for accepted papers to be included in the conference proceedings is 31 October 2019.