2 Squeezing Information from Temporal Spatial Datasets
  Leverage exascale data and computer resources to squeeze the most out of image, sensor or simulation data
  Run lots of different algorithms to derive same features
  Run lots of algorithms to derive complementary features
  Data models and data management infrastructure to manage data products, feature sets and results from classification and machine learning algorithms
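As a rough illustration of running several algorithms to derive the same feature, the sketch below applies two thresholding rules to a toy intensity grid and scores how well their output masks agree using the Jaccard index. The grid values, cutoffs and function names are all hypothetical, chosen only for illustration; the slides do not name specific algorithms or an agreement measure.

```python
from itertools import combinations

def threshold_mask(image, cutoff):
    """Return the set of (row, col) pixels whose intensity exceeds cutoff."""
    return {(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value > cutoff}

def jaccard(a, b):
    """Jaccard index of two pixel sets; treated as 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def pairwise_agreement(masks):
    """Jaccard index for every pair of named masks (names sorted alphabetically)."""
    return {(n1, n2): jaccard(masks[n1], masks[n2])
            for n1, n2 in combinations(sorted(masks), 2)}

# Toy 3x3 intensity image (hypothetical data).
image = [
    [10, 50, 90],
    [20, 60, 95],
    [15, 40, 85],
]

# Two stand-in "algorithms" that derive the same feature (bright regions).
masks = {
    "strict": threshold_mask(image, 80),
    "lenient": threshold_mask(image, 45),
}
agreement = pairwise_agreement(masks)
```

Storing each mask under the name of the algorithm that produced it keeps the competing results side by side, which is the kind of bookkeeping the data management bullet calls for.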

3 Pipelines to Carry out Feature Extraction and Classification in Brain Tumor Research

9 Analogous Feature Extraction and Classification Issues in Most Fields
  Astrophysics
    Which portions of a star's core are susceptible to implosion over time period [t1, t2]?
    Compute streamlines on vector field v within grid points [(x1,y1)-(x2,y2)]
  Material Science
    Is crystalline growth likely to occur within range [p1, p2] of pressure conditions?
    Compute likelihood of local cyclic relationships among nanoparticles within a frame
  Cancer studies
    Which regions of the tumor are undergoing active angiogenesis in response to hypoxia?
    Determine image regions where (blood vessel density > 20) and (nuclei and necrotic region are within 50 microns of each other)
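The last cancer-studies query can be read as a spatial predicate over extracted features. A minimal sketch follows, assuming each image region already carries a blood vessel density value plus centroid coordinates (in microns) for its nuclei and necrotic areas. The `Region` class, its field names, the sample values and the choice to treat "within 50 microns" as inclusive are all assumptions, not details from the slides.

```python
from dataclasses import dataclass, field
from math import hypot

@dataclass
class Region:
    name: str
    vessel_density: float
    nuclei: list = field(default_factory=list)    # (x, y) centroids in microns
    necrosis: list = field(default_factory=list)  # (x, y) centroids in microns

def min_distance(points_a, points_b):
    """Smallest Euclidean distance between two point sets (inf if either is empty)."""
    return min((hypot(ax - bx, ay - by)
                for ax, ay in points_a
                for bx, by in points_b),
               default=float("inf"))

def matches(region, density_cutoff=20, max_gap_microns=50):
    """True when vessel density exceeds the cutoff and some nucleus lies
    within max_gap_microns (inclusive) of some necrotic centroid."""
    return (region.vessel_density > density_cutoff
            and min_distance(region.nuclei, region.necrosis) <= max_gap_microns)

# Hypothetical sample regions.
regions = [
    Region("A", 25, nuclei=[(0, 0)], necrosis=[(30, 40)]),    # gap of 50 microns
    Region("B", 25, nuclei=[(0, 0)], necrosis=[(300, 400)]),  # gap of 500 microns
    Region("C", 10, nuclei=[(0, 0)], necrosis=[(3, 4)]),      # density too low
]
selected = [r.name for r in regions if matches(r)]
```

The all-pairs distance check is quadratic in the number of centroids; at the data volumes the slides describe, a spatial index would likely be needed instead.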

10 Data Science Research Challenges
  Coordination and management of algorithms and metadata needed to carry out high throughput feature extraction and classification
  On-demand user interactivity with exascale, complex multi-algorithm analysis frameworks
  Computer assisted annotation and markup for very large datasets
  Development of the actual image analysis and machine learning algorithms
  Structural and semantic metadata management: how to manage the tradeoff between flexibility and curation
  Data and semantic modeling infrastructures and policies able to scale to handle distributed systems with an aggregate of 10^9 or more data models/concepts
