14th ACM-IEEE International Conference On Formal Methods and Models For System Design

Indian Institute of Technology, Kanpur

November 18-20, 2016

As in previous years, the IPM HPC team won the MEMOCODE 2016 design contest. Our team not only defended its previous years' rank in the performance/cost class but also won the performance class.

Contest Problem: MEMOCODE’16 will include a design contest, which will pose a computational challenge that participants may solve using hardware or software on FPGAs, GPUs, and CPUs. The conference will sponsor at least one prize with a monetary award for the contest winners. The 2016 challenge is K-means clustering, an unsupervised method for clustering multidimensional data points that aims to partition the points into “K” subgroups (clusters) of similar points. It is used in a variety of applications such as data mining, image segmentation, medical imaging, and bioinformatics.

IPM-HPC Solution: Our method makes full use of four high-throughput GPUs and hides memory latency. The 2016 MEMOCODE Design Contest task was to efficiently compute k-means clustering on a large multidimensional data set. Our solution is based on Lloyd’s algorithm, and we optimized both its algorithmic structure and its parallelism. We implemented the design using Intel Xeon E5 CPUs and NVIDIA GTX 980 GPUs. Our best overall result computed the solution in 106 ms using four GPUs. In terms of cost-normalized results, our best solution was the two-GPU implementation, which was only 1.5x slower than the four-GPU solution at half the cost. The implementation strategy involved careful parallelization of the problem across the available platforms, as well as optimization of the arithmetic the problem requires.
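For readers unfamiliar with Lloyd’s algorithm, the following is a minimal single-threaded sketch of its two alternating steps (assign each point to the nearest centroid, then move each centroid to the mean of its points). It is not the team's GPU implementation. The function name `lloyd_kmeans`, the fixed iteration count, and the choice of the first k points as initial centroids are assumptions made for illustration.

```python
def lloyd_kmeans(points, k, iterations=20):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Assumption: the first k points serve as initial centroids; real
    implementations usually pick a smarter initialization.
    """
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance preserves the ordering).
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = lloyd_kmeans(data, k=2)
    print(centers, labels)
```

The assignment step is independent for every point, which is the part that typically maps well onto many parallel threads.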

Contest Problem: As in previous years, MEMOCODE’15 will include a design contest, which will pose a computational challenge that participants may solve using hardware or software on FPGAs, GPUs, and CPUs. The conference will sponsor at least one prize with a monetary award for the contest winners. The 2015 challenge is continuous skyline computation, in which the dataset is not constant but changes over time. The aim of this contest is to implement a system that efficiently computes the continuous skyline of a large dynamic dataset.

IPM-HPC Solution: Our method makes full use of the CPU and minimizes memory access. We present an efficient parallel continuous skyline approach. In our method, the dataset points are sorted and pruned based on Manhattan distance. We also apply several optimizations to reduce memory usage compared with a naïve implementation. In addition to conventional parallelization methods, we partition the time steps according to the number of available cores. Experimental results for a data set of 800k points with 7 dimensions show considerable speedup. For more information, see the paper.
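One common way to combine sorting and pruning for a static skyline is sketched below. It is an illustration only, not the team's code, and it does not model the time-varying part of the contest. It assumes that smaller values are better in every dimension and that all coordinates are non-negative, so the Manhattan distance from the origin is simply the coordinate sum. The function names `dominates` and `skyline` are hypothetical.

```python
def dominates(p, q):
    """True if p is at least as good as q everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points that no other point dominates.

    Sorting by coordinate sum (Manhattan distance from the origin, under
    the non-negativity assumption) guarantees that a dominating point
    always comes before the points it dominates, so each candidate only
    needs to be checked against the skyline found so far.
    """
    result = []
    for p in sorted(points, key=sum):
        if not any(dominates(s, p) for s in result):
            result.append(p)
    return result


if __name__ == "__main__":
    pts = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
    print(skyline(pts))  # (3, 3) and (4, 4) are dominated by (2, 2)
```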

Contest Problem: The 2014 MEMOCODE hardware contest problem is k-Nearest Neighbor search using the Mahalanobis distance metric. Given a data set of points in multidimensional space, the goal is to find the k points that are nearest to any given point in that space.
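As a rough illustration of the problem statement, the sketch below performs a brute-force k-nearest-neighbor search under the squared Mahalanobis distance (x - y)^T S^-1 (x - y). It assumes the inverse covariance matrix S^-1 is supplied by the caller as a list of rows; the names `mahalanobis_sq` and `knn` are hypothetical and this is not any contestant's implementation.

```python
import heapq


def mahalanobis_sq(x, y, inv_cov):
    """Squared Mahalanobis distance between x and y.

    `inv_cov` is the inverse covariance matrix (assumed precomputed).
    The square root is skipped because it does not change the ranking.
    """
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))


def knn(query, data, k, inv_cov):
    """Return the k points of `data` nearest to `query`, closest first."""
    return heapq.nsmallest(k, data, key=lambda p: mahalanobis_sq(query, p, inv_cov))


if __name__ == "__main__":
    # The second dimension has a large variance, so it counts for less.
    inv_cov = [[1.0, 0.0], [0.0, 0.01]]
    data = [(1, 0), (0, 5), (2, 0), (0, 20)]
    print(knn((0, 0), data, k=2, inv_cov=inv_cov))
```

With this matrix, the point (0, 5) ranks ahead of (1, 0) even though it is farther away in Euclidean terms, which is the effect the metric is meant to capture.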

IPM-HPC Solution: According to the email from the MEMOCODE chair, 10 excellent solutions were submitted, utilizing a range of algorithmic as well as system-level optimizations and targeting FPGA, GPU, and multi-core platforms. Our method makes full use of the CPU and minimizes memory access. It won Best Cost-Normalized Performance in the MEMOCODE 2014 design contest and is 616x faster than other implementations of the contest. See this paper for more detail.

Contest Problem: The 2014 MEMOCODE software contest problem is to make an emulator run even faster on the Raspberry Pi through software solutions. The contest used a specific emulator for the Space Invaders game, which was written for the 8080 processor.

IPM-HPC Solution: According to the MEMOCODE software contest report, three contestants successfully completed this software design contest. These contestants employed a variety of techniques to both discover and optimize performance bottlenecks in the emulator. Our method focuses on optimizing frequently invoked function calls and on reorganizing data structures to better match the Raspberry Pi’s underlying hardware architecture. This approach is 2.5 times faster than the conventional implementation of the contest. See this paper for more detail.

The IPM-HPC laboratory won the MEMOCODE Design Contest for the third time in 2013. Our team not only defended its previous years' rank in the performance/cost class but also won the performance class. The 2013 problem for the MEMOCODE Design Contest was part of the stereo matching problem: given a stereo image pair, the challenge was to infer the depth information for each pixel in the image.

According to the email we received from the design contest chair, 8 excellent solutions were submitted to the contest this year, utilizing a range of algorithmic as well as system-level optimizations and targeting FPGA, GPU, and multi-core platforms. The top-performing solutions, including ours, were provided the judging image to help the judges make the final decision.

Based on run-time, cost, and inference accuracy, the IPM team’s solution was selected as the all-around winner.

The IPM-HPC laboratory defended its previous rank and won the MEMOCODE design contest for the second time. The 2012 problem for the MEMOCODE Design Contest was part of the DNA sequence alignment problem: exact substring matching. The challenge was to efficiently locate millions of 100-base-pair short read sequences in a 3-million-base-pair reference genome.

The IPM-HPC team provided two solutions: the first is based on the BWT data structure and the second on hash indexing. Both methods use multicore CPUs and Pthreads. The hash-indexing method was one of the two winners of the contest: it won the normalized class and is able to align 1.2 GB of short read sequences in only 4 seconds. The HPC team members this year were: Aryan Arbabi, Milad Gholami, Mojtaba Varmazyar, and Shervin Daneshpajouh. See details on the MEMOCODE page.
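The general idea of hash indexing for exact substring matching can be sketched as follows. This is a simplified, single-threaded illustration and not the team's Pthreads implementation. It assumes every read is at least `seed_len` characters long; the names `build_index`, `locate`, and `seed_len` are hypothetical.

```python
from collections import defaultdict


def build_index(reference, seed_len):
    """Map every substring of length `seed_len` to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - seed_len + 1):
        index[reference[i:i + seed_len]].append(i)
    return index


def locate(read, reference, index, seed_len):
    """Return all positions where `read` occurs exactly in `reference`.

    The first `seed_len` characters of the read are looked up in the
    hash index, and each candidate position is then verified against
    the full read.
    """
    candidates = index.get(read[:seed_len], [])
    return [pos for pos in candidates if reference.startswith(read, pos)]


if __name__ == "__main__":
    ref = "ACGTACGTTAGC"
    idx = build_index(ref, seed_len=4)
    print(locate("ACGTT", ref, idx, seed_len=4))  # only the second "ACGT" extends to "ACGTT"
```

Because the index is built once and each read is looked up independently, the lookups can be distributed across threads without coordination.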