Benchmarking Intel Xeon Phi vs. Sandy Bridge

Intel has been careful to label the Xeon Phi as a coprocessor, a device that is always paired with a Xeon CPU. But how do the two compare on real applications? Over at the Xcelerit Blog, Paul Sutton benchmarks both devices using an optimized parallel version of the Monte-Carlo LIBOR swaption portfolio pricer.

The application is executed once on the host CPUs (the Sandy Bridge processors), and again on the Xeon Phi coprocessor in offload mode. The execution time of the full application is measured, including data transfers, random number generation, and reduction. All of these steps run on the target processor.
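To make the measurement approach concrete, here is a minimal sketch of end-to-end timing in Python. It is not the benchmark's code: `price_paths` and its placeholder payoff are hypothetical stand-ins for the real swaption pricer, and there is no offload step here. The point is only that the timer wraps random number generation, per-path work, and the reduction together, rather than the compute kernel alone.

```python
import random
import time


def price_paths(num_paths, seed=42):
    """Toy Monte Carlo estimate (hypothetical stand-in for the real pricer).

    Returns (estimate, elapsed_seconds), where the elapsed time covers
    random number generation, per-path computation, and the reduction.
    """
    start = time.perf_counter()

    # Random number generation: generator setup is paid even for few paths.
    rng = random.Random(seed)
    draws = [rng.gauss(0.0, 1.0) for _ in range(num_paths)]

    # Per-path computation: a placeholder payoff, max(z, 0).
    payoffs = [max(z, 0.0) for z in draws]

    # Reduction: average the per-path payoffs into a single estimate.
    estimate = sum(payoffs) / num_paths

    elapsed = time.perf_counter() - start
    return estimate, elapsed
```

Calling `price_paths` with increasing path counts and comparing the elapsed times gives a rough feel for how fixed setup costs shrink relative to per-path work as the problem grows.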

According to the results, from about 100K paths onwards, the Intel Xeon Phi becomes faster than the Sandy Bridge processors, reaching a speedup of nearly 3x at 1M paths. With lower numbers of paths, the Sandy Bridge outperforms the Phi. This can be explained by the added data transfers and the comparatively low level of parallelism for a low number of paths (considering both vectorization and multi-threading). The setup time for the random number generator also becomes more dominant on the Xeon Phi when there is relatively little computation performed.
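The crossover described above can be illustrated with a simple linear cost model, assuming each device's run time is a fixed overhead (data transfers, generator setup) plus a constant cost per path. This is an assumption made for illustration, not the model used in the benchmark, and the function names are hypothetical. Real devices will deviate from it, particularly at low path counts where parallelism is not saturated.

```python
def total_time(num_paths, fixed_overhead, time_per_path):
    """Run time under a linear model: fixed overhead plus per-path cost."""
    return fixed_overhead + num_paths * time_per_path


def crossover_paths(overhead_a, per_path_a, overhead_b, per_path_b):
    """Path count beyond which device B is faster than device A.

    Solves overhead_a + n * per_path_a == overhead_b + n * per_path_b for n.
    Returns None if B is not faster per path, since it then never overtakes A
    by processing more paths.
    """
    if per_path_b >= per_path_a:
        return None
    n = (overhead_b - overhead_a) / (per_path_a - per_path_b)
    # A negative solution means B is already faster for any number of paths.
    return max(0.0, n)
```

Under this model, a device with higher fixed overhead but lower per-path cost loses on small workloads and wins on large ones, which matches the qualitative behavior reported for the Xeon Phi.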
