12 Oncology
Runtime
- Code expected to take 400 days; optimisation reduced this to 130 days, but still too long
- Need a parallel code
Memory
- Impossible to fit all the data into memory
- However, we only actually need 5% of the results
Scaling
- 2D decomposition used with a task farm
- More chunks than processors
Sorting
- Parallel sorting algorithm used
Computed interactions between all pairs of markers
- 565,000² computations
- Runtime reduced from 400 days to 5 hours on 512 CPUs on HECToR
- 8.5 × 10⁹ (192 GB) probability values obtained
- Sorting performed in 5 minutes
HPC and Big Data 12
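The scaling idea on this slide (a 2D decomposition served by a task farm, with more chunks than processors, keeping only a small fraction of the results) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the slides do not show the real code, `score` is a hypothetical stand-in for the interaction statistic, "keep the largest values" is only one possible reading of "we only need 5% of the results", and a thread pool stands in for whatever parallel workers were used on HECToR.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import heapq


def make_tiles(n_markers, tile):
    """2D decomposition: split the n x n pair matrix into square tiles."""
    edges = [(s, min(s + tile, n_markers)) for s in range(0, n_markers, tile)]
    return [(rows, cols) for rows in edges for cols in edges]


def score(i, j):
    """Hypothetical stand-in for the real marker-pair statistic."""
    return ((i * 31 + j * 17) % 101) / 100.0


def process_tile(bounds, keep):
    """Work unit: score every pair in one tile, return only its best `keep`."""
    (r0, r1), (c0, c1) = bounds
    results = [(score(i, j), i, j) for i in range(r0, r1) for j in range(c0, c1)]
    return heapq.nlargest(keep, results)


def task_farm(n_markers, tile, workers, keep):
    """Hand tiles to idle workers and merge partial results as they arrive."""
    tiles = make_tiles(n_markers, tile)
    best = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_tile, t, keep) for t in tiles]
        for fut in as_completed(futures):
            # Only `keep` entries are ever retained, so the full result
            # set never has to fit in memory at once.
            best = heapq.nlargest(keep, best + fut.result())
    return tiles, best


if __name__ == "__main__":
    n = 20
    keep = (n * n) // 20  # 5% of all n^2 pairs
    tiles, best = task_farm(n_markers=n, tile=6, workers=4, keep=keep)
    print(len(tiles), "tiles for 4 workers;", len(best), "results kept")
```

Having more tiles than workers is what lets the farm balance load: a worker that finishes a cheap tile simply picks up the next one. Merging per-tile top lists is safe because any value in the overall top `keep` must also be in the top `keep` of its own tile.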

13 Square Kilometre Array (SKA)
- Largest and most sensitive radio telescope in the world
- To be built in South Africa and Australia
- 3000 dishes
- 1 EB data generated per day
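To give a feel for the 1 EB per day figure, the short calculation below converts it into a sustained data rate. It assumes a decimal exabyte (10^18 bytes) and a perfectly even flow over 24 hours; neither assumption is stated on the slide, and the function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_EB = 10**18  # decimal exabyte (assumed)
BYTES_PER_TB = 10**12  # decimal terabyte (assumed)


def sustained_rate_tb_per_s(exabytes_per_day):
    """Average rate in TB/s needed to move the given daily volume."""
    bytes_per_second = exabytes_per_day * BYTES_PER_EB / SECONDS_PER_DAY
    return bytes_per_second / BYTES_PER_TB


if __name__ == "__main__":
    print(f"{sustained_rate_tb_per_s(1):.1f} TB/s")
```

Under these assumptions, 1 EB per day corresponds to roughly 11.6 TB every second.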

14 Facilities: EDIM1
A machine for Data Intensive Research
- Commissioned by EPCC & Informatics
- Designed for I/O-intensive applications
- Use commodity components
- Combine them in a novel way
- Use cheap low-power processors

20 UK Research Data Facility
RDF consists of
- 7.8 PB disk
- 19.5 PB backup tape
Aims
- Provide a high capacity robust file store;
- Persistent infrastructure - will last beyond any one national service;
- Will remove end of service data issues - transfers at end of services have become increasingly lengthy;
- Will also ensure that data from the current HECToR service is secured - this will ensure a degree of soft landing if there is ever a gap in National Services;
RDF is designed for long term data storage
Currently only open to HECToR users
