Data-driven computing needs no introduction today. The case for using data for strategic advantage is exemplified by web search engines, online translation tools and many other applications. The past decade has seen 1) the emergence of multicore architectures and accelerators such as GPGPUs, 2) widespread adoption of distributed computing via the MapReduce/Hadoop ecosystem and 3) democratization, through cloud computing, of the infrastructure for processing massive datasets ranging into petabytes.
The complexity of the technological stack has grown to the extent that it is imperative to provide frameworks that abstract away the system architecture and the orchestration of components for massive-scale processing. However, the growth in the volume and heterogeneity of data seems to outpace the growth in computing power. A "collect everything" culture, stimulated by cheap storage and ubiquitous sensing capabilities, contributes to increasing the noise-to-signal ratio in all collected data. Thus, as soon as the data hits the processing infrastructure, determining the value of information, finding its rightful place in a knowledge representation and determining subsequent actions are of paramount importance. To use this data deluge to our advantage, a convergence between the field of Parallel and Distributed Computing and the interdisciplinary science of Artificial Intelligence seems critical. From application domains of national importance such as cyber-security, health care and the smart grid to real-time situational awareness on smartphones with natural interfaces, the fundamental AI tasks of Learning and Inference need to be enabled for large-scale computing across this broad spectrum of application domains.

Many of the prominent algorithms for learning and inference are notorious for their complexity. Adopting parallel and distributed computing appears to be an obvious path forward, but the mileage varies depending on, first, how amenable the algorithms are to parallel processing and, second, the availability of rapid prototyping capabilities with a low cost of entry. The first issue represents the wider gap, as we continue to think in a sequential paradigm. The second issue is increasingly recognized at the level of programming models, and building robust libraries for various machine-learning and inference tasks will be a natural progression. As an example, scalable versions of many prominent graph algorithms written for distributed shared-memory architectures or clusters look distinctly different from the textbook versions that generations of programmers have grown up with. This reformulation is difficult to accomplish for an interdisciplinary field like Artificial Intelligence because of the sheer breadth of the knowledge spectrum involved. The primary motivation of the proposed workshop is to invite leading minds from the AI and Parallel & Distributed Computing communities to identify the research areas that most require convergence and to assess their impact on the broader technical landscape.

HIGHLIGHTS

Foster collaboration between the HPC and AI communities

Apply HPC techniques to learning problems

Identify HPC challenges arising from learning and inference

Explore a critical emerging area with strong academia and industry interest

A great opportunity for researchers worldwide to collaborate with academia and industry

ADVANCE PROGRAM

8:20-8:30 Opening Remarks

8:30-9:30 Keynote 1

Professor Eric Xing (CMU), On The Algorithmic and System Interface of BIG LEARNING

Abstract: In many modern applications built on massive data and using high-dimensional models, such as web-scale content extraction via topic models, genome-wide association mapping via sparse regression, and image understanding via deep neural networks, one needs to handle BIG machine learning problems that threaten to exceed the limits of current infrastructures and algorithms. While the ML community continues to strive for new scalable algorithms, and several attempts at developing new system architectures for BIG ML have emerged to address the challenge on the backend, good dialog between ML and systems remains difficult: most algorithmic research remains disconnected from the real systems and data it is to face, and the generality, programmability, and theoretical guarantees of most systems on ML programs remain largely unclear. In this talk, I will present Petuum, a general-purpose framework for distributed machine learning, and demonstrate how innovations in scalable algorithms and distributed systems design work in concert to achieve multiple orders of magnitude of scalability on a modest cluster for a wide range of large-scale problems in social networks (mixed-membership inference on 100M nodes), personalized genome medicine (sparse regression on 100M dimensions), and computer vision (classification over 20K labels), with provable guarantees on the correctness of distributed inference.

Bio: Dr. Eric Xing is an associate professor in the School of Computer Science at Carnegie Mellon University. His principal research interests lie in the development of machine learning and statistical methodology, especially for solving problems involving automated learning, reasoning, and decision-making in high-dimensional and dynamic possible worlds, and for building quantitative models and predictive understandings of biological systems. Professor Xing received a Ph.D. in Molecular Biology from Rutgers University, and another Ph.D. in Computer Science from UC Berkeley. His current work involves 1) foundations of statistical learning, including theory and algorithms for estimating time/space varying-coefficient models, sparse structured input/output models, and nonparametric Bayesian models; 2) computational and statistical analysis of gene regulation, genetic variation, and disease associations; and 3) application of statistical learning in social networks, data mining, and vision. Professor Xing has published over 150 peer-reviewed papers, and is an associate editor of the Journal of the American Statistical Association, Annals of Applied Statistics, the IEEE Transactions on Pattern Analysis and Machine Intelligence, the PLoS Journal of Computational Biology, and an Action Editor of the Machine Learning journal. He is a recipient of the NSF Career Award, the Alfred P. Sloan Research Fellowship in Computer Science, the United States Air Force Young Investigator Award, and the IBM Open Collaborative Research Faculty Award.

Dr. Simon Kahan (University of Washington)

Title: Grappa: chaos, order, and easier cluster computing

Abstract: Systems demand chaotic parallelism while components demand order. Graceful transformations between the two are necessary for high performance. Grappa performs these transformations. Grappa is a new latency-tolerant runtime system for distributed-memory commodity clusters that provides a shared-memory programming model for in-memory computation similar to what TBB and Cilk provide on single-node platforms. Grappa implementations of map/reduce, the GraphLab API, and a Raco backend show promising performance in comparison to the specialized platforms Spark, GraphLab, and Shark, respectively. In addition, Grappa supports general computation, including complex irregular applications that have poor locality and data-dependent load distribution. Source is available for download from GitHub.

Bio: Simon Kahan holds affiliate positions in the Computer Science and Engineering department at the University of Washington, at the Institute for Systems Biology, and at the Northwest Institute for Advanced Computing. He has held positions as a research scientist at the Pacific Northwest National Laboratory, senior member of technical staff at Google, and principal engineer at Cray Inc. He received his PhD in Computer Science from the University of Washington in 1991 and BS and MS degrees in Electrical Engineering from UC Berkeley in 1983 and 1985.

CALL FOR PAPERS

Authors are invited to submit manuscripts of original unpublished research that demonstrate a strong interplay between parallel/distributed computing techniques and learning/inference applications, such as algorithm design and library/framework development on multicore/manycore architectures, GPUs, clusters, supercomputers, and cloud computing platforms that target applications including but not limited to:

Submitted manuscripts may not exceed 10 single-spaced, double-column pages in 10-point font on 8.5x11 inch pages (IEEE conference style), including figures, tables, and references. More format requirements will be posted on the IPDPS web page (www.ipdps.org) shortly after author notification. Authors can purchase up to 2 additional pages for camera-ready papers after acceptance; details are available at www.ipdps.org. Students with accepted papers may apply for a travel award; details are also available at www.ipdps.org.

Camera-ready papers should be submitted to the IEEE ConfPublishing portal. See the instructions on the IPDPS web page.

PROCEEDINGS

All papers accepted by the workshop will be included in the proceedings of the IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), indexed in EI and possibly in SCI.