ScaleMP Server Aggregation Adds Big Data Capabilities

This week ScaleMP released version 5 of its vSMP Foundation software for HPC. With support for new hardware platforms such as Intel’s new Xeon Phi co-processor, this server-virtualization-for-aggregation software now enables large-memory applications for Big Data and Big Science. To learn more, I caught up with ScaleMP’s CEO, Shai Fultheim.

insideHPC: What’s new with the vSMP Foundation 5 release?

Shai Fultheim: There are a number of new capabilities that extend our reach and broaden our market. The most significant features are that we will support Intel TrueScale InfiniBand as well as the Intel Xeon Phi coprocessor, virtualizing it so that it is very easy to get applications up and running on that system. In addition, we expanded processor support to the latest generation of AMD processors and the Intel E5-4600, and we have added a new product focused on large-memory VMs (bioinformatics and big data), as well as support for running vSMP Foundation over KVM, using Ethernet between nodes.

insideHPC: How does vSMP Foundation work with the new Intel Xeon Phi co-processors and what advantages does this computing approach have for HPC users?

Shai Fultheim: Our work with the Intel Xeon Phi coprocessor is very different from that of other HPC vendors. We provide a way for developers or end users to get their applications running on a system with the Intel Xeon Phi coprocessor installed. We virtualize the Intel Xeon Phi coprocessors and their memory together with the host processors and memory to make everything look like one large system. Then an application can be tuned little by little to take advantage of the performance and the large number of cores on the Intel Xeon Phi. In addition, we support the standard co-processor programming model for customers looking for large-scale shared-memory systems that support a significant number of Intel Xeon Phi coprocessors.

insideHPC: How does ScaleMP help users with Big Data Analytics?

Shai Fultheim: The more data that can be stored in memory, the better. As the amount of data that needs to be analyzed grows, it matters more and more that the data sits in memory that is easily accessible from all of the cores. Using a single address space for high-speed memory is easier and faster than using slower flash or hard disk drives. vSMP Foundation 5 allows creating virtual machines that aggregate the memory of several systems without their processors, allowing for in-memory processing on cost-optimized hardware.

insideHPC: Does it improve Hadoop performance?

Shai Fultheim: It can. Making the distributed nodes larger, in the sense of giving them more memory, means more data can be stored in memory rather than on hard disk drives. Another aspect is that instead of maintaining many nodes in a Hadoop cluster, fewer nodes can be used, which eases the management burden.

insideHPC: Can vSMP Foundation co-exist with other VMs out there? How does that work?

Shai Fultheim: With vSMP Foundation 5 we will offer the ability to connect previously created VMs (for example, ones created with KVM) together to create an SMP that can be quite large. We aggregate the individual VMs and then deliver an SMP built from the smaller VMs. Our software can span individual hardware systems, giving users much larger VMs than were possible before.

insideHPC: How does vSMP Foundation deliver more performance with Batched I/O?

Shai Fultheim: Processors and devices need several interactions for each I/O operation. For example, to get a drive to pull a few sectors into memory, a processor needs to access memory two or three times and also issue at least one I/O write operation. vSMP Foundation 5 batches those operations in a memory buffer close to the processor to provide faster execution across the InfiniBand fabric. We have seen a 2x improvement in disk transfer rate and up to a 5x improvement in the network rate in some cases. Depending on the application, your mileage may vary.
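
To illustrate why batching helps, here is a toy cost model in Python. It is only a sketch and not how vSMP Foundation is implemented: the function names, the latency figures, and the batch size are all assumptions chosen for illustration, and the numbers it produces are not ScaleMP measurements. The four interactions per operation follow the example above (up to three memory accesses plus one I/O write).

```python
def unbatched_cost_ns(num_ops, interactions_per_op, round_trip_ns):
    """Total time when every interaction crosses the fabric on its own."""
    return num_ops * interactions_per_op * round_trip_ns


def batched_cost_ns(num_ops, interactions_per_op, round_trip_ns,
                    local_ns, batch_size):
    """Total time when interactions are queued in a buffer near the
    processor and the buffer is flushed across the fabric once per batch."""
    num_flushes = -(-num_ops // batch_size)  # ceiling division
    local_time = num_ops * interactions_per_op * local_ns
    return local_time + num_flushes * round_trip_ns


if __name__ == "__main__":
    # Hypothetical figures: 1000 I/O operations, 4 interactions each,
    # 2000 ns per fabric round trip, 100 ns per local buffer access,
    # and 32 operations per batch.
    slow = unbatched_cost_ns(1000, 4, 2000)
    fast = batched_cost_ns(1000, 4, 2000, 100, 32)
    print(f"unbatched: {slow} ns, batched: {fast} ns, ratio: {slow / fast:.1f}x")
```

In this model the gain comes entirely from replacing many fabric round trips with cheap local accesses plus a few flushes, so the benefit shrinks as the batch size approaches one or as local accesses become as expensive as remote ones.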
