BenchPrime: Accurate Benchmark Subsetting with Optimized Clustering Algorithm Selection
http://hdl.handle.net/10919/84916
Liu, Qingrui; Wu, Xiaolong; Kittinger, Larry; Levy, Markus; Jung, Changhee
2018-08-24
This paper presents BenchPrime, an automated benchmark analysis toolset that is systematic and extensible enough to analyze the similarity and diversity of benchmark suites. BenchPrime takes multiple benchmark suites and their evaluation metrics as inputs and generates a hybrid benchmark suite comprising only the essential applications. Unlike prior work, BenchPrime uses linear discriminant analysis rather than principal component analysis, and it selects the best clustering algorithm and the optimal number of clusters in an automated, metric-tailored way, thereby achieving high accuracy. In addition, BenchPrime ranks the benchmark suites by the diversity of their application sets and estimates how unique each suite is compared to the others. As a case study, this work compares, for the first time, DenBench with MediaBench and MiBench using four different metrics to provide a multi-dimensional understanding of the benchmark suites. For each metric, BenchPrime measures to what degree DenBench applications are irreplaceable by those in MediaBench and MiBench. This provides a means of identifying an essential subset of the three benchmark suites without compromising the application balance of the full set. The experimental results show that the necessity of including DenBench applications varies across the target metrics and that significant redundancy exists among the three benchmark suites.
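As a rough illustration of the kind of selection loop BenchPrime automates, the following Python sketch projects synthetic feature vectors with LDA and then picks, among two clustering algorithms and a range of cluster counts, the combination with the best silhouette score. The synthetic data, the silhouette criterion, and the first-member representative rule are illustrative assumptions, not BenchPrime's actual pipeline.

    # Sketch of a metric-tailored clustering selection loop (synthetic data,
    # not BenchPrime's pipeline): LDA projection, then pick the algorithm and
    # cluster count that maximize silhouette score.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.cluster import KMeans, AgglomerativeClustering
    from sklearn.metrics import silhouette_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 8))          # 40 applications x 8 performance counters
    y = np.repeat(np.arange(4), 10)       # coarse metric-derived class labels

    Z = LinearDiscriminantAnalysis(n_components=3).fit_transform(X, y)

    best = None
    for k in range(2, 10):
        for name, algo in [("kmeans", KMeans(n_clusters=k, n_init=10, random_state=0)),
                           ("agglomerative", AgglomerativeClustering(n_clusters=k))]:
            labels = algo.fit_predict(Z)
            score = silhouette_score(Z, labels)
            if best is None or score > best[0]:
                best = (score, name, k, labels)

    score, name, k, labels = best
    # One representative application per cluster forms the hybrid subset.
    reps = [int(np.where(labels == c)[0][0]) for c in range(k)]
    print(f"{name}, k={k} (silhouette={score:.2f}); representatives: {reps}")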
A Composable Workflow for Productive FPGA Computing via Whole-Program Analysis and Transformation (with Code Excerpts)
http://hdl.handle.net/10919/84388
Sathre, Paul; Helal, Ahmed; Feng, Wu
2018-07-24
We present a composable workflow to enable highly productive heterogeneous computing on FPGAs. The workflow consists of a trio of static analysis and transformation tools: (1) a whole-program, source-to-source translator that transforms existing parallel code to OpenCL, (2) a set of OpenCL kernel linters that target FPGAs to detect possible semantic errors and performance traps, and (3) a whole-program OpenCL linter that validates the host-to-device interface of OpenCL programs. The workflow promotes the rapid realization of heterogeneous parallel code across a multitude of heterogeneous computing environments, particularly FPGAs, by providing complementary tools for automatic CUDA-to-OpenCL translation and compile-time OpenCL validation in advance of very expensive compilation, placement, and routing on FPGAs. The proposed tools perform whole-program analysis and transformation to tackle real-world, large-scale parallel applications. The efficacy of the workflow tools is demonstrated via a representative translation and analysis of a sizable CUDA finite-automata processing engine, as well as the analysis and validation of an additional 96 OpenCL benchmarks.
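To give a flavor of the third tool, the host-interface linter, here is a deliberately toy Python check for one class of error such a linter can catch well before place-and-route: a kernel argument the host never sets. The regex-based parsing and the example sources are illustrative assumptions; the actual tool performs whole-program static analysis, not pattern matching.

    # Toy host-to-device interface check: does the host set every argument the
    # __kernel signature declares? Regex parsing is for illustration only.
    import re

    kernel_src = """
    __kernel void saxpy(const float a, __global const float *x, __global float *y) {
        int i = get_global_id(0);
        y[i] += a * x[i];
    }
    """

    host_src = """
    clSetKernelArg(saxpy_k, 0, sizeof(float), &a);
    clSetKernelArg(saxpy_k, 1, sizeof(cl_mem), &d_x);
    /* argument 2 forgotten: a bug a full FPGA build would surface hours later */
    """

    sig = re.search(r"__kernel\s+void\s+(\w+)\s*\(([^)]*)\)", kernel_src)
    name, params = sig.group(1), sig.group(2)
    declared = len([p for p in params.split(",") if p.strip()])
    set_args = {int(m) for m in re.findall(r"clSetKernelArg\([^,]+,\s*(\d+)", host_src)}

    missing = set(range(declared)) - set_args
    if missing:
        print(f"kernel '{name}': arguments never set on host: {sorted(missing)}")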
MOANA: Modeling and Analyzing I/O Variability in Parallel System Experimental Design
http://hdl.handle.net/10919/82857
Cameron, Kirk W.; Anwar, Ali; Cheng, Yue; Xu, Li; Li, Bo; Ananth, Uday; Lux, Thomas; Hong, Yili; Watson, Layne T.; Butt, Ali R.
2018-04-19
Exponential increases in complexity and scale make variability a growing threat to sustaining HPC performance at exascale. Performance variability in HPC I/O is common, acute, and formidable. We take the first step towards comprehensively studying linear and nonlinear approaches to modeling HPC I/O system variability. We create a modeling and analysis approach (MOANA) that predicts HPC I/O variability for thousands of software and hardware configurations on highly parallel shared-memory systems. Our findings indicate that nonlinear approaches to I/O variability prediction are an order of magnitude more accurate than linear regression techniques. We demonstrate the use of MOANA to accurately predict the confidence intervals of unmeasured I/O system configurations for a given number of repeat runs, enabling users to quantitatively balance experiment duration against statistical confidence.
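The contrast the paper draws between linear and nonlinear predictors can be reproduced in miniature. The sketch below fits ordinary linear regression and a random forest (one possible nonlinear stand-in, not necessarily the models MOANA uses) to a synthetic configuration-to-variability dataset that is nonlinear by construction.

    # Linear vs. nonlinear prediction of a synthetic, deliberately nonlinear
    # configuration-to-variability mapping (illustrative data, not MOANA's).
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_absolute_error

    rng = np.random.default_rng(1)
    X = rng.uniform(size=(500, 4))   # e.g., thread count, block size, scheduler, fs knobs
    y = np.sin(3 * X[:, 0]) * X[:, 1] + 0.5 * (X[:, 2] > 0.5) + 0.05 * rng.normal(size=500)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    for model in (LinearRegression(), RandomForestRegressor(random_state=0)):
        pred = model.fit(X_tr, y_tr).predict(X_te)
        print(f"{type(model).__name__}: MAE = {mean_absolute_error(y_te, pred):.3f}")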
Modeling Influence using Weak Supervision: A Joint Link and Post-level Analysis
http://hdl.handle.net/10919/82748
Chen, Liangzhe; Prakash, B. Aditya
2018-04-09
Microblogging websites like Twitter and Weibo are used by billions of people to create and spread information. This activity depends on various factors such as the friendship links between users, their topic interests, and the social influence between them. Making sense of these behaviors is very important for fully understanding and utilizing these platforms. Most prior work on modeling social media either ignores the effect of social influence or considers its effect only on link formation or post generation. In contrast, in this paper we propose POLIM, which jointly models the effect of influence on both link and post generation, leveraging weak supervision. We also give POLIM-FIT, an efficient parallel inference algorithm for POLIM that scales to large datasets. In our experiments on a large corpus of tweets, we detect meaningful topical communities and celebrities, as well as the influence-strength patterns among them. Further, we find that significant portions of posts and links are caused by influence, and that this portion increases when the data focuses on a specific event. We also show that differentiating and identifying this influenced content benefits other quantitative downstream tasks, such as predicting future tweets and link formation.
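A minimal caricature of the weak-supervision idea: heuristically label a post as influenced when someone the author follows posted on the same topic shortly before. The rule, time window, and toy data below are invented for illustration; POLIM's generative model is substantially richer than this.

    # Heuristic (weak) labeling of influenced posts: a post is "influenced" if a
    # followee posted the same topic within WINDOW time units. All data invented.
    from collections import defaultdict

    follows = {"alice": {"bob"}, "bob": set(), "carol": {"alice", "bob"}}
    posts = [                      # (time, user, topic)
        (1, "bob",   "eclipse"),
        (2, "alice", "eclipse"),   # alice follows bob -> weakly labeled influenced
        (3, "carol", "football"),  # no prior matching post -> organic
        (5, "carol", "eclipse"),   # window expired -> organic under this rule
    ]

    WINDOW = 2
    recent = defaultdict(list)     # topic -> [(time, user), ...]
    for t, user, topic in posts:
        influenced = any(u in follows[user] and t - tu <= WINDOW
                         for tu, u in recent[topic])
        print(f"t={t} {user}/{topic}: {'influenced' if influenced else 'organic'}")
        recent[topic].append((t, user))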
Segmentations with Explanations for Outage Analysis
http://hdl.handle.net/10919/82747
Chen, Liangzhe; Muralidhar, Nikhil; Chinthavali, Supriya; Ramakrishnan, Naren; Prakash, B. Aditya
2018-04-09
Recent hurricane events have caused unprecedented amounts of damage and severely threatened our public safety and economy. The most observable (and severe) impact of these hurricanes is the loss of electric power in many regions, which causes the breakdown of many public services. Understanding power outages and how they evolve during a hurricane provides insights into how to reduce outages in the future and how to improve the robustness of the underlying critical infrastructure systems. In this paper, we propose a novel segmentation-with-explanations framework to help experts understand such datasets. Our method, CUT-n-REVEAL, first finds a segmentation of the outage sequences that captures pattern changes in the sequences. We then propose a novel explanation optimization problem to find an intuitive explanation of the segmentation that highlights the culprit of each change. Via extensive experiments, we show that our method performs consistently across multiple datasets with ground truth. We further study real county-level power outage data from several recent hurricanes (Matthew, Harvey, Irma) and show that CUT-n-REVEAL recovers important, nontrivial, and actionable patterns for domain experts.
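The segmentation half of the pipeline can be sketched with greedy binary segmentation, splitting a series wherever the split most reduces within-segment squared error. This stand-in recovers change points on synthetic outage counts; the explanation step, which is the paper's main novelty, is not reproduced here, and the cost function and threshold are assumptions.

    # Greedy binary segmentation on synthetic county-level outage counts: split
    # where within-segment squared error drops the most; stop below a gain
    # threshold. Cost function and threshold are assumptions for illustration.
    import numpy as np

    def sse(x):
        return float(((x - x.mean()) ** 2).sum())

    def best_split(x):
        costs = [sse(x[:i]) + sse(x[i:]) for i in range(2, len(x) - 1)]
        i = int(np.argmin(costs)) + 2
        return i, sse(x) - costs[i - 2]

    def segment(x, offset=0, min_gain=5.0):
        if len(x) < 6:
            return []
        i, gain = best_split(x)
        if gain < min_gain:
            return []
        return segment(x[:i], offset) + [offset + i] + segment(x[i:], offset + i)

    rng = np.random.default_rng(2)
    outages = np.concatenate([np.full(30, 2.0), np.full(25, 9.0), np.full(20, 4.0)])
    outages += rng.normal(0, 0.5, size=len(outages))
    print("change points:", segment(outages))   # expect points near 30 and 55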
GPU Power Prediction via Ensemble Machine Learning for DVFS Space Exploration
http://hdl.handle.net/10919/81997
Dutta, Bishwajit; Adhinarayanan, Vignesh; Feng, Wu-chun
2018-02-02
A software-based approach to achieving high performance within a power budget often involves dynamic voltage and frequency scaling (DVFS). Consequently, accurately predicting the power consumption of an application at different DVFS levels (or, more generally, different processor configurations) is paramount for the energy-efficient functioning of a high-performance computing (HPC) system. The increasing prevalence of graphics processing units (GPUs) in HPC systems presents new multi-dimensional challenges in power management, and machine learning presents a unique opportunity to improve the software-based power management of these HPC systems. As such, we explore the problem of predicting the power consumption of a GPU at different DVFS states via machine learning. Specifically, we perform statistically rigorous experiments to evaluate eight machine-learning techniques (i.e., ZeroR, simple linear regression (SLR), KNN, bagging, random forest, sequential minimal optimization regression (SMOreg), decision tree, and neural networks) for predicting GPU power consumption at different frequencies. Based on these results, we propose a hybrid ensemble technique that incorporates SMOreg, SLR, and decision tree, which, in turn, reduces the mean absolute error (MAE) to 3.5%.
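A small sketch of the hybrid-ensemble idea using scikit-learn stand-ins: an averaging ensemble of a linear SVR (standing in for Weka's SMOreg), simple linear regression, and a decision tree, fit to synthetic power data. The 3.5% MAE in the abstract comes from the authors' GPU measurements, not from this toy.

    # Averaging ensemble of three scikit-learn stand-ins (linear SVR for SMOreg,
    # linear regression, decision tree) on synthetic GPU power data.
    import numpy as np
    from sklearn.ensemble import VotingRegressor
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVR
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(3)
    X = rng.uniform(size=(300, 3))   # e.g., core freq, memory freq, utilization
    power = 40 + 60 * X[:, 0] * X[:, 2] + 20 * X[:, 1] + rng.normal(0, 2, 300)

    X_tr, X_te, y_tr, y_te = train_test_split(X, power, random_state=0)
    ensemble = VotingRegressor([
        ("smo", SVR(kernel="linear")),          # stand-in for Weka's SMOreg
        ("slr", LinearRegression()),
        ("tree", DecisionTreeRegressor(random_state=0)),
    ])
    pred = ensemble.fit(X_tr, y_tr).predict(X_te)
    print(f"ensemble MAE: {mean_absolute_error(y_te, pred):.2f} W (synthetic)")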
ETH: A Framework for the Design-Space Exploration of Extreme-Scale Visualization
http://hdl.handle.net/10919/79454
Abrams, Gregory; Adhinarayanan, Vignesh; Feng, Wu-chun; Rogers, David; Ahrens, James; Wilson, Luke
2017-09-29
As high-performance computing (HPC) moves towards the exascale era, large-scale scientific simulations are generating enormous datasets. A variety of techniques (e.g., in-situ methods, data sampling, and compression) have been proposed to help visualize these large datasets under various constraints such as storage, power, and energy. However, evaluating these techniques and understanding the various trade-offs (e.g., performance, efficiency, quality) remains a challenging task. To enable investigation and optimization across such trade-offs, we propose a toolkit for the early-stage exploration of visualization and rendering approaches, job layout, and visualization pipelines. Our framework covers a broader parameter space than existing visualization applications such as ParaView and VisIt. It also promotes the study of simulation-visualization coupling strategies through a data-centric approach, rather than requiring the simulation code itself. Furthermore, with experimentation on an extensively instrumented supercomputer, we study more metrics of interest than was previously possible. Overall, our framework will help to answer important what-if scenarios and trade-off questions in the early stages of pipeline development, helping scientists make informed choices about how to best couple a simulation code with visualization at extreme scale.
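The sort of what-if exploration the framework supports can be mimicked with a plain parameter sweep: enumerate pipeline configurations, score each with a cost/quality model, and keep the Pareto-optimal points. The parameters and the cost model below are invented placeholders, not the framework's instrumented measurements.

    # Enumerate visualization pipeline configurations, score them with a toy
    # cost/quality model, and keep the Pareto-optimal points. All invented.
    from itertools import product

    configs = product(["in-situ", "in-transit", "post-hoc"],   # coupling strategy
                      [1, 4, 16],                              # sampling stride
                      [False, True])                           # compression

    def estimate(coupling, stride, compress):                  # placeholder model
        io_cost = {"in-situ": 1, "in-transit": 3, "post-hoc": 10}[coupling] / stride
        io_cost *= 0.4 if compress else 1.0
        quality = (0.9 if compress else 1.0) / stride
        return io_cost, quality

    points = [(c, s, z, *estimate(c, s, z)) for c, s, z in configs]
    pareto = [p for p in points
              if not any(q[3] <= p[3] and q[4] >= p[4] and q != p for q in points)]
    for c, s, z, cost, qual in sorted(pareto, key=lambda p: p[3]):
        print(f"{c:10s} stride={s:2d} compress={z}: cost={cost:5.2f} quality={qual:.3f}")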
CommAnalyzer: Automated Estimation of Communication Cost on HPC Clusters Using Sequential Code
http://hdl.handle.net/10919/78701
Helal, Ahmed E.; Jung, Changhee; Feng, Wu-chun; Hanafy, Yasser Y.
2017-08-14
MPI+X is the de facto standard for programming applications on HPC clusters. The performance and scalability of such applications are limited by the communication cost across different numbers of processes and compute nodes. Communication analysis tools therefore play a critical role in the design and development of HPC applications. However, these tools require the availability of an MPI implementation, which might not exist in the early stages of development because of the effort and time that parallel programming demands. This paper presents CommAnalyzer, an automated tool for generating a communication model from sequential code. CommAnalyzer uses novel compiler analysis techniques and graph algorithms to capture the inherent communication characteristics of sequential applications and to estimate their communication cost on HPC systems. Experiments with real-world, regular and irregular scientific applications demonstrate the utility of CommAnalyzer in estimating the communication cost on HPC clusters with more than 95% accuracy on average.
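The core intuition, inferring communication from where data lives versus where it is read, fits in a few lines for a fixed access pattern. The sketch below counts boundary-crossing reads of a 1-D 3-point stencil under a block distribution; CommAnalyzer derives such patterns automatically from compiler IR rather than hard-coding them as done here.

    # Count the reads of a 1-D 3-point stencil that cross block-distribution
    # boundaries; each crossing read is one word of communication. CommAnalyzer
    # derives the access pattern from compiler IR; it is hard-coded here.
    N, P = 65_536, 8
    block = N // P

    def owner(i):               # rank that owns element i under block distribution
        return min(i // block, P - 1)

    words = 0
    for i in range(N):          # sequential loop: out[i] = f(a[i-1], a[i], a[i+1])
        for j in (i - 1, i + 1):    # neighbor reads
            if 0 <= j < N and owner(j) != owner(i):
                words += 1

    print(f"{words} remote words across {P} ranks "
          f"({words * 8} bytes at 8 bytes/word)")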
Personal Reflections on 50 Years of Scientific Computing: 1967–2017
http://hdl.handle.net/10919/78691
Watson, Layne T.
2017-08-10
Computer hardware, software, numerical algorithms, and science and engineering applications are traced over a half century from the author's perspective.
Understanding Recurring Software Quality Problems of Novice Programmers
http://hdl.handle.net/10919/78337
Techapalokul, Peeratham; Tilevich, Eli
2017-07-12
It remains unclear when the right time is to introduce software quality into the computing curriculum. Introductory students often cannot afford to also worry about software quality, while advanced students may already have been groomed into undisciplined development practices. To answer such questions, educators need strong quantitative evidence about the persistence of software quality problems in programs written by novice programmers. This technical report presents a comprehensive study of software quality in programs written by novice programmers. By leveraging the patterns of recurring quality problems, known as code smells, we analyze a longitudinal dataset of more than 100 novice Scratch programmers and close to 3,000 of their programs. Even after gaining proficiency, students continue to introduce certain quality problems into their programs, suggesting the need for educational interventions. Given the importance of software quality for modern society, computing educators should teach quality-promoting practices alongside the core computing concepts.
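Two of the recurring quality problems such a study can detect, overly long scripts and duplicated block sequences, are easy to illustrate on a toy model of Scratch projects as lists of block opcodes. The thresholds and the project data below are invented for illustration and are not the paper's detectors.

    # Two toy smell detectors over Scratch-like scripts (lists of block opcodes):
    # long scripts and duplicated block sequences. Thresholds and data invented.
    from collections import Counter

    project = {
        "greet":  ["whenflagclicked"] + ["say", "wait"] * 6 + ["say"],
        "move":   ["whenkeypressed", "changex", "changey", "changex", "changey"],
        "attack": ["whenclicked", "changex", "changey", "changex", "changey"],
    }

    LONG_SCRIPT = 12   # blocks
    DUP_LEN = 4        # length of repeated sequence worth flagging

    ngrams = Counter()
    for name, blocks in project.items():
        if len(blocks) >= LONG_SCRIPT:
            print(f"smell: long script '{name}' ({len(blocks)} blocks)")
        for i in range(len(blocks) - DUP_LEN + 1):
            ngrams[tuple(blocks[i:i + DUP_LEN])] += 1

    for seq, count in ngrams.items():
        if count > 1:
            print(f"smell: duplicated {DUP_LEN}-block sequence {list(seq)} x{count}")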