HPCwire » virtualization
http://www.hpcwire.com
Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

Accelerating the Cloud with GPUs (May 16, 2014)
http://www.hpcwire.com/2014/05/16/accelerating-cloud-gpus/

If you were to come up with a list of transformative technologies to hit HPC in the last decade, cloud services and general-purpose GPUs would rank pretty high. While the idea of using virtual machines to run technical computing workloads was anathema to some, at least initially, the benefits were hard to argue with. Ease of use, scalability, elasticity, and pay-as-you-go pricing were all major draws, but there was still the matter of overhead, i.e., the virtualization penalty. In terms of sheer performance, bare metal had the advantage.

As cloud grew in popularity, so did something called GPGPU computing – that is, using general-purpose graphics processing units to accelerate computing jobs. Perhaps the two technologies, GPUs and virtualization, could be combined to create a cloud environment that would satisfy the needs of HPC workloads. A group of computer scientists set out to explore this very question.

“We propose to bridge the gap between supercomputing and clouds by providing GPU-enabled virtual machines,” the team writes in a recently published paper. “Specifically, the Xen hypervisor is utilized to leverage specialized hardware-assisted I/O virtualization tools in order to provide advanced HPC-centric NVIDIA GPUs directly in guest VMs.”

It’s not the first time that GPU-enabled virtual machines have been tried. Amazon EC2 and a couple of other cloud vendors have GPU offerings; however, the approach is not without challenges, and there are different ideas about how best to implement a GPU-based cloud environment, which the paper addresses.

The team of scientists, from Indiana University and the Information Sciences Institute (ISI), a unit of the University of Southern California’s Viterbi School of Engineering, compared the performance of two Tesla GPUs in a variety of applications using the native and the virtualized modes. To carry out the experiments, the researchers used two different machines, one outfitted with Fermi GPUs and the other with newer Kepler chips. After running several benchmarks and assessing the results, the authors conclude that GPU-backed virtual machines are viable for a range of scientific computing workflows. On average, the performance hit was 2.8 percent for Fermi GPUs and 4.9 percent for Kepler GPUs.

In their comparison of virtualized environments with bare metal ones, the team studied three data points: FLOPS, device bandwidth and PCI bus performance. Among the more notable results in the FLOPS testing portion, the team found that even for double-precision FLOPS, the Kepler GPUs achieved nearly a doubling in performance. But more applicable to the research at hand, when it comes to raw FLOPS available to each GPU in both native and virtualized modes, the virtualization overhead was between 0 and 2.9 percent.

When other applications were tested, the performance penalty ranged from 0 percent in some cases to over 30 percent. The FFT benchmarks resulted in the most overhead, while the matrix multiplication based benchmarks had an average overhead of 2.9 percent for the virtualized setups.
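To make those figures concrete, here is a minimal sketch, not the authors’ benchmark harness, of how a per-benchmark overhead percentage can be derived from paired runs of the same kernel on bare metal and inside the GPU-backed VM; the numbers are invented for illustration.

```python
# A minimal sketch (assumed numbers, not the study's data): deriving the kind
# of per-benchmark overhead percentages reported above from paired native and
# virtualized runs of the same GPU kernel.
def overhead_percent(native: float, virtualized: float) -> float:
    """Performance lost to virtualization, as a percentage of native throughput."""
    return 100.0 * (native - virtualized) / native

# Hypothetical GFLOPS results: (bare metal, inside the VM).
paired_runs = {
    "matrix multiply": (1030.0, 1000.0),
    "fft":             (410.0, 300.0),
}

for name, (native, virt) in paired_runs.items():
    print(f"{name}: {overhead_percent(native, virt):.1f}% overhead")
```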

In terms of device speed, which was measured in both raw bandwidth and 3rd party benchmarks, virtualization had “minimal or no significant performance impact.”

The final dimension being tested, the PCI Express bus, had the highest potential for overhead, according to the research. “This is because the VT-d and IOMMU chip instruction sets interface directly with the PCI bus to provide operational and security related mechanisms for each PC device, thereby ensuring proper function in a multi-guest environment,” the authors state. “As such, it is imperative to investigate any and all overhead at the PCI Express bus.”

In analyzing these results, the authors note that “as with all abstraction layers, some overhead is usually inevitable as a necessary trade-off to added feature sets and improved usability.”

While the same is true for GPU-equipped virtual machines, the research team contends that the overhead is minimal. They add that their method of direct PCI passthrough of NVIDIA GPUs using the Xen hypervisor can be cleanly implemented within many Infrastructure-as-a-Service environments. The next step for the team will be integrating this model with the OpenStack Nova IaaS framework with the aim of enabling researchers to create their own private clouds.
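The passthrough model itself is easy to picture. The sketch below is an illustration only, not the team’s tooling: it assumes the libvirt Python bindings with the Xen driver, a placeholder PCI address for the GPU, and a host with VT-d/IOMMU already enabled.

```python
# Hedged illustration: handing a host GPU to a Xen guest via PCI passthrough
# using the libvirt Python bindings. The PCI address and guest name are
# placeholders; IOMMU/VT-d support must already be enabled on the host.
import libvirt

GPU_HOSTDEV_XML = """
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
  </source>
</hostdev>
"""

def attach_gpu(guest_name: str) -> None:
    conn = libvirt.open("xen:///system")   # connect to the local Xen hypervisor
    try:
        dom = conn.lookupByName(guest_name)
        # After this call the guest sees the physical GPU directly and can run
        # the normal NVIDIA driver and CUDA stack inside the VM.
        dom.attachDevice(GPU_HOSTDEV_XML)
    finally:
        conn.close()

if __name__ == "__main__":
    attach_gpu("hpc-guest-01")             # hypothetical guest VM name
```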

A Large Memory System for Free (March 10, 2014)
http://www.hpcwire.com/2014/03/10/large-memory-system-free/

Do you have an application that would run faster if it could use high speed memory rather than disk? Did you ever want to run an application that needed more memory than your current server supports?

ScaleMP has the answer. vSMP Foundation Free allows you to create a virtual system that seamlessly addresses up to 1 terabyte (TB) of memory, aggregated from adjacent systems – and the best news is that it is free! A simple registration process and easy download allow you to connect up to 8 servers to create a 1 TB virtual machine at no cost. Yes, you have to already have the servers, but by installing vSMP Foundation Free you can:

Keep an entire database in memory:

There is no need to break up your database into multiple chunks. Using a large memory system, your entire database can remain in memory.

Make quicker decisions by investigating more data (i.e., Big Data):

When more and varied data is kept in memory, better decisions can be made by analyzing more complete data.

Work on more complete genomics information:

As Next Gen Sequencers produce more information than ever before, having a system that can handle this massive amount of data is critical in all workflows.

Simulate larger MCAE models by using more memory for the model and temporary I/O:

With a large memory system, higher fidelity models can be analyzed, and execution is faster because temporary information is stored in memory rather than on slower hard disk drives or flash memory.

Run larger research projects with more complete physics:

Large memory systems allow you to simulate more physics in many application domains.

Ask yourself:

What would happen if you could fit more of your data in memory?

How would your insights be different ?

Would you make different decisions ?

Let’s look at a quick example. Your database is about 1 TB in size. If you have 8 servers available, each with only 128 GB of memory, you would have to divide up your database into 8 smaller chunks, each about 1/8 of the total. However, by aggregating the 8 servers into one system, the entire 1 TB database could be loaded into memory and queries made against the entire data set.
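The arithmetic behind this example is worth spelling out; the short sketch below simply recomputes it with the assumed figures (8 servers, 128 GB each, a roughly 1 TB database).

```python
# Back-of-the-envelope check of the example above (assumed figures).
SERVERS = 8
MEM_PER_SERVER_GB = 128
DATABASE_GB = 1024  # roughly 1 TB

aggregated_gb = SERVERS * MEM_PER_SERVER_GB   # memory visible to the single OS image
chunk_gb = DATABASE_GB / SERVERS              # per-server chunk if sharded instead

print(f"Aggregated memory: {aggregated_gb} GB")
print(f"Whole database fits in memory: {aggregated_gb >= DATABASE_GB}")
print(f"Chunk size without aggregation: {chunk_gb:.0f} GB per server")
```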

Figure 1 below graphically shows how a large memory solution would be created. The first server’s I/O, CPUs, and memory, together with memory from the other nodes, are available to the application – just as if it were a single machine. In fact, it is a single machine. Note that only one operating system is installed.

Figure 1 – Large Memory Solution Diagram

In summary, a large memory system can be created for free, with vSMP Foundation Free. Up to 8 servers and 1 TB of memory can be presented to the application.

The Benefits of Bare-Metal Clouds (August 27, 2013)
http://www.hpcwire.com/2013/08/27/the_benefits_of_bare-metal_clouds/

Cloud computing promises numerous benefits to businesses – among these are agility, scalability, and reduced cost – but the virtualization layer inherent in most public clouds has been somewhat of an anathema to the HPC community. Are bare metal clouds the answer?

Today’s clouds are built on a mix of technologies, including virtualization, automation and orchestration. In some circles, virtualization is nearly synonymous with cloud, but above all a cloud is a pool of resources that is elastic, scalable and accessible on-demand. For the HPC community especially, much of the cloud computing that takes place is of the “bare metal” kind, aka non-virtualized cloud.

Although today’s server virtualization is a lot sleeker than in years past, nothing can beat the performance of bare metal. Some of the issues with virtualized cloud computing were detailed recently by Internap Vice President of Hosted Services Gopala Tumuluri.

Tumuluri points out what most HPCers already know: the virtualized, multi-tenant platform common to most public clouds is subject to performance degradation. “While the hypervisor enables the visibility, flexibility and management capabilities required to run multiple virtual machines on a single box, it also creates additional processing overhead that can significantly affect performance,” writes Tumuluri.

Data-heavy loads are the most likely to be negatively impacted, especially when the service is oversubscribed. Such a setting is ripe for the so-called noisy-neighbor problem that occurs when too many virtual machines compete for server resources.

This is where the bare metal cloud offers a significant advantage, especially for latency-sensitive workloads. Dedicated hardware delivered as a service offers the user the benefits of cloud – flexibility, scalability and efficiency – without the drawbacks of a shared server.

“It’s important not to confuse true bare-metal cloud capabilities with other, related terminology, such as ‘dedicated instances,’ which can still be part of a multi-tenant environment; and ‘bare-metal servers,’ or ‘dedicated servers,’ which could refer to a managed hosting service that involves fixed architectures and longer-term contracts,” he writes. “A bare-metal cloud model enables on-demand usage and metered hourly billing with physical hardware that was previously only sold on a fully dedicated basis.”

The bare metal cloud is a good match for bursty, I/O-heavy workloads. Ideal use cases include media encoding and render farms, which are both periodic and data-intensive in nature. “In the past, organizations couldn’t put these workloads into the cloud, or they simply had to accept lower performance levels,” remarks Tumuluri.

Companies with big data applications may also be interested in exploring the bare metal option. It’s definitely possible for high volume, high velocity data to encounter latency issues on virtualized cloud servers.

Last but not least is the issue of security. Organizations with stringent compliance guidelines – think finance, government and healthcare – can benefit by having their data contained in a well-defined physical environment.

As an umbrella term, Infrastructure-as-a-Service includes many different scenarios that appeal to different use cases. For some HPC workloads, dedicated hosting options with cloud-like features (elasticity, ease-of-use, utility-style billing, etc.) may offer the best of both worlds.

The Evolution of the Virtual Platform (May 1, 2013)
http://www.hpcwire.com/2013/05/01/the_evolution_of_the_virtual_platform/

At the 2013 Open Fabrics International Developer Workshop, in Monterey, California, VMware’s in-house HPC expert Josh Simons delivered a presentation [slides] on the Software-Defined Datacenter. While Simons mainly inhabits the HPC space, he donned his enterprise hat for this talk.

The phrase software-defined datacenter started making the rounds in the second half of 2012, spearheaded by VMware. At its essence, a software-defined datacenter is a prescriptive model for bringing the benefits of virtualization to the rest of the datacenter. It is an enabling technology for what some are calling Cloud 2.0.

In discussing the evolution of virtual platforms, Simons says “the next leap is going beyond the single-datacenter or beyond small or modestly-sized clusters to actually supporting a hybrid cloud model where you want mobility of those applications and workloads across a much wider range. You want to be able to do it across a full datacenter and also do it between, say, a private cloud deployment and a public cloud deployment.”

According to Simons, getting to this next level, i.e., achieving this robust, scalable hybrid cloud, means first putting in place the software-defined datacenter.

“We create a virtual datacenter abstraction underpinned by a set of all software services that allow us to provision networking, provision storage, and compute and memory, etc., all the resources that you need to stand up the service that you’re intent on standing up, but do that totally from a software-based perspective and do it at scale.

“That’s absolutely necessary if we’re going to move into the cloud. And that is, simply put, what the software-defined data center is about.

“The software-defined data center is the architecture that lets us deliver the cloud. It’s how we would build it as a provider of software to customers that are building clouds. This is the underpinning that you would use to do this.”

In this framing, cloud is a way of offering computing services that prioritizes:

• Self-service

• Elasticity

• Pay-by-use

• Agility

The software-defined datacenter provides an architecture for cloud where:

• All infrastructure is virtualized

• Delivered as a service

• Control of this datacenter is entirely automated by software

In differentiating software-defined networking and the software-defined datacenter, Simons explains that the latter is a critical component for delivering the full value of cloud computing, while the former is the means by which networking will be decoupled from underlying hardware.

The presentation also includes a discussion of the benefits of RDMA, which provides direct access to the physical hardware with performance as the primary goal.

While it is still early-stage thinking, Simons suggests that there may be a way to access the benefits of SDN and RDMA. Currently, datacenter workloads are only about 50 percent virtualized; and while that number is set to increase, there will always be a segment of workloads that requires the performance of bare metal. Additionally, the need for low-latency, high-bandwidth interconnect in the enterprise is a clear trend (to support scale-out DBMS, big data, HPC, and so forth). Simons asserts that future SDDC and SDN implementations must accommodate these realities, so the question then becomes how to do that.

One possible path forward, according to Simons, is to have the SDN layer reach down into the physical infrastructure to pull more data out of there, employing techniques like metrics and topology sensing so that more optimized placement decisions can be done by the SDN layer in support of delivering better application performance. VMware engineers are contemplating the possibility of using RoCE as the basis of an SDN environment that also supports RDMA.

Simons concludes that RDMA is clearly important for future enterprise datacenters, but he emphasizes that the work is at a very early stage. VMware welcomes community involvement to advance this goal.

Our HPC cloud research stories are hand-selected from leading science centers, prominent journals and relevant conference proceedings. The top piece this week lays out a lightweight approach to implementing virtual machine monitors. Other items explore an innovative parallel cloud storage system, HPC-to-cloud migration, anywhere-anytime cluster monitoring, and a framework for cloud storage.

A Lightweight VMM for a Multicore Era

While traditional virtualization using virtual machine monitors (VMMs) holds many efficiency advantages, the HPC community has generally shied away from the technology because of the associated performance overhead and the increased potential for bugs and other vulnerabilities. A team of researchers from Xi’an Jiaotong University in China is tackling these challenges with a novel lightweight approach to the creation of VMMs.

They note: “As resources in a multi-core server increase to more than adequate in the future, virtualization is not necessary although it provides convenience for cloud computing. Based on the above observations, this paper presents an alternative way for constructing a VMM: configuring a booting interface instead of virtualization technology.”

Their paper, “Lightweight VMM on Many Core for High Performance Computing,” describes their experience with a lightweight virtual machine monitor they call OSV.

“Rather than virtualizing resources in the computer, OSV only virtualizes the multiprocessor and memory configuration interfaces,” they state. “Operating systems access the resources allocated by OSV directly without the intervention of the VMM, and OSV just controls which part of the resources are accessible to an operating system.”

OSV has the ability to host multiple Linux kernels with very little impact on performance, the team explains. It uses only 6 hyper-calls, and Linux, running on top of OSV, is intercepted only for inter-processor interrupts. Resource isolation is carried out with hardware-assisted virtualization, and resource sharing is controlled by distributed protocols embedded in the operating systems.

Their test case uses a prototype of OSV on 32-core Opteron-based servers with SVM and cache-coherent NUMA architectures. The setup supports up to 8 Linux kernels on the server with fewer than 10 lines of code modifications to the Linux kernel. With only 8,000 lines of code, OSV supports a streamlined tuning and debugging process. The final result of the experiment showed a 23.7 percent performance increase over the Xen VMM.

A team from Los Alamos National Laboratory has revealed how they used the Swift Object Store from OpenStack as their disk-based cloud storage system. For the team, Swift provides “open source software for creating redundant, scalable object storage using clusters of standardized servers to store petabytes of accessible data.”

At the heart of this effort is the need to address growing HPC requirements on the archiving side. The team notes that just buying more tape or hard drives to keep up with demand is not a viable solution, and they believe that “merging advanced features from both HPC systems and cloud systems is a promising direction.”

They reiterate that this is not a file system or real-time storage approach, but rather a “long term storage system for a more permanent type of static data that can be retrieved, leveraged and then updated if necessary.”

As the team behind the project states,

At LANL, we have worked on high-performance computing (HPC) systems for many years. The LANL parallel log file system (PLFS) has demonstrated its superior capability for the conversion of logical N-to-1 parallel I/O operations into physical N-to-N parallel I/O operations on HPC production systems. In this article, we describe the leveraging of the scaling capability of cloud object storage systems and the transformative parallel I/O feature (Fig. 1) of the LANL PLFS and the building of a parallel cloud storage system.
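The transformation the quote refers to can be pictured as a log-structured write path. The sketch below is a conceptual illustration of that idea only, not PLFS code: each writer appends to its own physical log and records an index entry mapping the logical offset in the shared file to its private file.

```python
# Conceptual sketch only, not PLFS itself: turning an N-to-1 write pattern
# into N-to-N. Each rank appends to a private data log and records an index
# entry so the logical shared-file view can be reconstructed on read.
import os

class LogStructuredWriter:
    def __init__(self, container: str, rank: int):
        os.makedirs(container, exist_ok=True)
        self.data_path = os.path.join(container, f"data.{rank}")
        self.index_path = os.path.join(container, f"index.{rank}")
        self.physical_offset = 0

    def write(self, logical_offset: int, payload: bytes) -> None:
        # Append-only write to this rank's own file: N-to-N on the back end.
        with open(self.data_path, "ab") as data, open(self.index_path, "a") as index:
            data.write(payload)
            # logical offset -> (this rank's log, physical offset, length)
            index.write(f"{logical_offset} {self.physical_offset} {len(payload)}\n")
        self.physical_offset += len(payload)

# Two "ranks" writing disjoint regions of one logical file.
w0 = LogStructuredWriter("container", rank=0)
w1 = LogStructuredWriter("container", rank=1)
w0.write(0, b"A" * 4096)
w1.write(4096, b"B" * 4096)
```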

A new case study at IBM developerWorks lays out the steps involved in porting a massively parallel bioinformatics pipeline to the cloud. The transferring, stabilizing, and managing of massive data sets are all addressed by the team of IBM architects and engineers as are the architectural decisions that were necessary for this transformation.

“Recent breakthroughs in genomics have significantly reduced the cost of short-read genomic sequencing (determining the order of the nucleotide bases in a molecule of DNA),” they write. “Therefore, to a large extent, the task of full genomic reassembly – often referred to as secondary analysis (and familiar to those with parallel processing experience) – has become an IT challenge in which the issues are about transferring massive amounts of data over WANs and LANs, managing it in a distributed environment, ensuring stability of massively parallel processing pipelines, and containing the processing cost.”

In describing their experience porting a commercial HPC workload for genomic reassembly to a cloud environment, the authors say it’s like going from a pure HPC environment to a more big data type of approach.

The impetus for the move is a familiar one: the HPC infrastructure was approaching capacity and the volume of the analysis work was expected to rise substantially. So the goal of the project was to test the feasibility of a massively scalable cloud infrastructure while keeping costs down.

A trio of computer scientists from Shandong University in Jinan, China, are exploring the feasibility of anywhere, anytime cluster monitoring. More specifically, they are working to design and implement a cluster monitoring system based on Android.

The team starts with the view that high performance computing (HPC) has been democratized to the point that HPC clusters have become an important resource for many scientific fields, including graphics, biology, physics, climate research, and many others. Still, depending on local funding realities, the availability of such machines is almost universally constrained. In light of this, monitoring becomes an essential task necessary for the efficient utilization and management of limited resources. However, as the researchers observe, traditional cluster monitoring systems demonstrate poor mobility, which stymies proper management.

The authors are seeking to improve the flexibility of monitoring systems and improve the communication between administrators. They assert that the mobile cluster monitoring system outlined in their paper “will make it possible to monitor the whole cluster anywhere and anytime to allow administrators to manage, diagnose, and troubleshoot cluster issues more accurately and promptly.”

The system they developed is based on the Android platform, the brainchild of Google, and built on the open source monitoring tools Ganglia and Nagios. The design uses a client-server model, in which the server probes the data via the monitoring tools and produces a global view of the cluster. The mobile client retrieves the monitoring packages over a socket connection, and the cluster’s status is then displayed in the Android application.
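A stripped-down sketch of that client-server pattern follows; the JSON-over-TCP format, port number, and node data are assumptions, and the hard-coded status dictionary stands in for what the real server would aggregate from Ganglia and Nagios.

```python
# Minimal sketch of the monitoring server side described above. The message
# format, port, and node data are assumptions, not the authors' design.
import json
import socket

def serve_status(host: str = "0.0.0.0", port: int = 9900) -> None:
    # Placeholder "global view"; a real server would build this from
    # Ganglia and Nagios probes.
    status = {
        "node01": {"load": 0.42, "state": "up"},
        "node02": {"load": 1.87, "state": "up"},
    }
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen()
        while True:
            conn, _addr = srv.accept()
            with conn:
                # Push the current snapshot to the (Android) client and close.
                conn.sendall(json.dumps(status).encode("utf-8"))

if __name__ == "__main__":
    serve_status()
```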

UK computer scientists Victor Chang, Robert John Walters and Gary Wills set out to explore the topic of cloud storage and bioinformatics in a private cloud deployment. They’ve written a paper about their experience to serve as a resource for other researchers with data-intensive compute needs who are interested in analyzing the benefits of a cloud model.

Among the many benefits of the cloud model are its cost-savings potential, agility, efficiency, resource consolidation, business opportunities and possible energy savings. Despite the inherent attractiveness, there are still barriers to overcome, and one of these, according to the authors, is the need for a standard or framework to manage both operations and IT services.

They write that “this framework needs to provide the structure necessary to ensure any cloud implementation meets the business needs of industry and academia and include recommendations of best practices which can be adapted for different domains and platforms.”

Their work examines service portability for a private cloud deployment; storage, backup, data migration and data recovery are all addressed. The paper presents a detailed case study about cloud storage and bioinformatics services developed as part of the Cloud Computing Adoption Framework (CCAF). In order to illustrate the benefits of CCAF, the authors provide several bioinformatics examples, including tumor modeling, brain imaging, insulin molecules and simulations for medical training. They believe that their proposed solution offers cost reduction, time savings and user friendliness.

HPC Clouds: Present and Future (February 6, 2013)
http://www.hpcwire.com/2013/02/06/hpc_clouds__present_and_future-2/

A team of Croatian researchers published a paper in the January 2013 edition of the Scientific-professional Journal of Technical Faculties of University in Osijek examining the present state of high-performance computing in the cloud, with an emphasis on currently available solutions.

In spite of cloud’s many advantages, however, HPC cloud still faces an uphill battle as it seeks to address technical and cultural barriers. Accordingly, the majority of HPC solutions continue to be the traditional, earth-bound variety. The biggest challenge is I/O-related: the slow network speeds and commodity interconnects that characterize most clouds present significant bottlenecks to data-intensive applications. But there are additional contributing factors underlying the slow rate of adoption, the authors observe. Virtualization, for example, adds a layer of complexity that is anathema to the most latency-sensitive HPC workloads, while a growing reliance on co-processors presents its own set of challenges for HPC scaling.

HPC vendors are in the interesting position of being able to either support or retard the HPC cloud model. When it comes to developing cloud-friendly software models, the major ISVs have dragged their feet, not wanting to risk the cannibalization of proven income streams. But there are signs of momentum with most vendors now offering some level of cloud offering.

The researchers make the case that HPC vendors have indeed begun offering “fully functional HPC cloud solution[s].” In light of this, they recommend a set of helpful questions for would-be adopters:

Where does virtualization fit in?

Which type of clouds do they support: private, public or both, i.e., hybrid clouds?

How well do HPC applications scale on their cloud solutions?

The paper includes an overview of several current first, second and third-tier HPC cloud solutions.

The authors hope that their work will act as a “helpful compass for someone trying to shift from standard HPC to large computations in cloud environments.”

Free Lectures: Cloud, Virtualization, MapReduce and More (January 7, 2013)
http://www.hpcwire.com/2013/01/07/free_lectures_cloud_virtualization_mapreduce_and_more/

Several lectures from the VSCSE Summer School on Science Clouds (July 30, 2012) are now available for viewing on YouTube. The presentations provide a clear and concise overview of the state of cloud and virtualization technologies with a particular focus on MapReduce.

These free, online lectures are part of the MOOC (massive open online course) movement. MOOCs are the product of an open education ethic characterized by open access and scalability.

There are currently four “Cloud Computing MOOC” lectures available for viewing. In the first one, Professor Geoffrey Fox introduces the Indiana University Cloud MOOC. In addition to laying out the agenda, Fox provides examples of the applications that are best-suited for clouds, most notably those that are “pleasingly parallel.” He highlights several science projects, for example FutureGrid, that are using cloud-based technologies, but also alludes to a lot of untapped potential.

Fox points to some interesting future possibilities. For example, it is projected that 24 billion devices will be connected to the Internet by 2020. This Internet of Things will rely on cloud for control and management functions. More and more, computing will look like a grid or mesh that touches nearly every aspect of our lives. The ability to offload computational tasks to the cloud will also enable advances in mobile computer devices and robotics.

Life science is another major vertical when it comes to cloud technology. Assistant Prof. Michael Schatz of the Simons Center for Quantitative Biology lectures on the use of cloud computing in genetic sequencing. Schatz is known for having produced some highly-sophisticated uses of MapReduce for biology applications. MapReduce was developed at Google for big data computations. It is a proprietary framework, but thanks to a 2004 paper, there are now open source implementations, most notably Hadoop.

Schatz notes that “Google every single day does the equivalent of a year’s worth of sequence analysis.” Traditional servers are no longer sufficient to handle such enormous data loads, but that’s where parallel computing technologies like MapReduce come in. Schatz gives an overview of the benefits and challenges of Hadoop and MapReduce before delving into specific implementations.
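As a toy illustration of the MapReduce pattern Schatz applies to sequence data, the sketch below counts k-mers in reads in the style of Hadoop Streaming; the k-mer length and the one-read-per-line input format are assumptions, and this is not Schatz’s pipeline.

```python
# Illustrative sketch: counting k-mers in sequencing reads with a map step and
# a reduce step, Hadoop Streaming style. Assumes one read per line on stdin
# and a k-mer length of 21.
import sys
from itertools import groupby

K = 21

def mapper(lines):
    # Map: emit a (k-mer, 1) pair for every K-length window in every read.
    for line in lines:
        read = line.strip().upper()
        for i in range(len(read) - K + 1):
            yield read[i:i + K], 1

def reducer(pairs):
    # Reduce: sum the counts for each k-mer (Hadoop would deliver keys sorted).
    for kmer, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield kmer, sum(count for _, count in group)

if __name__ == "__main__":
    for kmer, count in reducer(mapper(sys.stdin)):
        print(f"{kmer}\t{count}")
```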

In the next video series, Professor J. Hacker argues that there is a growing need for virtualization in HPC. He explains that the motivation for this conclusion is threefold: the clock speed increases following Moore’s law have ceased; hardware is going multicore (for example, Intel MIC); and the memory capacity of systems is increasing (512 GB on systems today). He notes that the traditional approach is to tie a single application to a single server. With 50-plus cores, this approach is no longer effective. Virtualization technology is being used to partition large-scale servers to run many operating systems and VMs independently of each other.

The entire lecture is less than one hour long and provides an overview of virtualization and cloud technology in relation to HPC and then offers some practical advice for leveraging virtual HPC clusters. Hacker refers to cloud computing as the “distributed computing of this decade.” He views cloud as a computing utility that provides services over a network that “pushes functionality from devices at the edge (e.g. laptops and mobile phones) to centralized servers.”

In the last video series, Jonathan Klinginsmith, a PhD candidate at the School of Informatics and Computing at Indiana University, speaks about virtual clusters, MapReduce and the cloud. He covers such important questions as “Why is cloud interesting?” (hint: scalability, elasticity, utility computing).

While Klinginsmith’s main research interest is machine learning and artificial intelligence, he has turned to computer science and information systems to address the problem of growing data sets. He is not alone. Researchers from nearly every scientific endeavor are finding it necessary to attain some degree of computational proficiency.

Klinginsmith aims his talk primarily at these non-computer scientists. Thus his presentation focuses mainly on running applications on top of clusters rather than getting too deep into the nuts and bolts of building and operating clusters. For anyone who is just getting started with Hadoop or MapReduce, this will be a valuable resource. In under an hour, the viewer should acquire a basic understanding of MapReduce, virtual machines, clusters, cloud and virtualization.

VMware, EMC Spinoff Resurfaces (December 4, 2012)
http://www.hpcwire.com/2012/12/04/vmware_emc_spinoff_resurfaces/

VMware and EMC spinoff rumors, which came to light back in July, are now official. Earlier today, Terry Anderson, VP of Global Corporate Communications at VMware, published a blog entry announcing that the companies’ non-core cloud assets would be combined as part of the newly formed Pivotal Initiative.

“VMware and EMC are committing key existing technology, people and programs from both companies focused on Big Data and Cloud Application Platforms under one virtual organization – the Pivotal Initiative,” writes Anderson.

The new organization will include “most employees and resources currently working within EMC’s Greenplum and Pivotal Labs organizations, VMware’s vFabric (including Spring and Gemfire), Cloud Foundry and Cetas organizations as well as related efforts.”

All told, approximately 600 employees from VMware and 800 employees from EMC will make the switch.

The new subsidiary will be led by Paul Maritz, Chief Strategy Officer for EMC and former CEO of VMware. The companies expect to formalize the union by mid-2013, subject to regulatory approval.

The companies’ decision to create one uber-focused cloud and big data play speaks to the growing cloud market and the dominance of Amazon and first-run competitors, Google and Microsoft. All three vendors have doubled down on their cloud commitments in recent months; and on the open source side, there’s a bevy of OpenStack (and CloudStack) based offerings chomping at their respective bits, as well.

The move will allow VMware to focus on its primary business, server virtualization and software-defined networking and could give EMC greater leverage on the cloud storage front. All in all, it’s a good move, but will require expert direction and management to pull off.

What if you could combine the benefits of virtualization, grid and cloud computing to accelerate Windows-based applications? An Israeli company, Xoreax, is doing just that. We spoke with Xoreax at SC12 in Salt Lake City, Utah, last week to learn more about their offering.

Xoreax got its start back in 2002 and for the last 10 years, they’ve been accelerating software in the Windows environment, using distributed, aka grid, computing technology. Their IncrediBuild-XGE (Xoreax Grid Engine) software uses a unique technology called process level virtualization to create a virtual HPC machine. The software harnesses unutilized CPUs to create a private grid. It runs on a company’s existing Windows infrastructure and extends into the public cloud if more resources are required.

“It’s like having virtual HPC – every workstation in this network can scale out to tap into the amount of resources that is in the entire network,” says CTO Dori Exterman.

“We have an agent in every machine and create a private grid out of all these machines and can also scale out to the cloud. The solution is currently Windows-only, and there aren’t many Windows solutions for this kind of cycle scavenging.”

Unlike other grid and virtualization solutions, IncrediBuild works out of the box. Traditional grid solutions require the user to change the source code and target an API. This extra work makes sense for very demanding applications like real-time financial trading, but Xoreax is targeting more general-purpose workloads that are also process-intensive.

The solution combines grid and virtualization, but instead of virtualizing entire machines like VMware does, the technology virtualizes a single process.

“We can take a process and virtualize it on the network. When the process needs some files from the local machine, the software acts like a middleman between the process and the operating system and we can take all that environment that the process requires on demand from the initial machine, cache it on the remote machine and give the process everything it needs in order to work on the remote machine,” explains Exterman.

“The process does not care where it’s running because we will create all the environment for it. You don’t need to install any software on the remote machines,” he adds.
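The mechanism Exterman describes can be pictured as an on-demand file cache on the remote machine. The sketch below is a loose illustration with assumed paths and a placeholder transport, not IncrediBuild’s implementation.

```python
# Loose illustration of the "middleman" idea, not IncrediBuild's code: when a
# virtualized process asks for a file, the agent serves it from a local cache,
# faulting it in from the initiating machine on demand.
import os
import shutil

CACHE_ROOT = "/var/cache/procvirt"   # assumed cache location on the remote host
INITIATOR_VIEW = "/mnt/initiator"    # stand-in for the initiating machine's files

def fetch_from_initiator(path: str, dest: str) -> None:
    # Placeholder transport: a real agent would pull the file over the network.
    shutil.copyfile(os.path.join(INITIATOR_VIEW, path.lstrip("/")), dest)

def resolve(path: str) -> str:
    """Return a local path for `path`, fetching and caching it if needed."""
    cached = os.path.join(CACHE_ROOT, path.lstrip("/"))
    if not os.path.exists(cached):
        os.makedirs(os.path.dirname(cached), exist_ok=True)
        fetch_from_initiator(path, cached)
    return cached

# The virtualized process's file requests would be redirected through resolve().
```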

Traditional HPC usually requires costly investments of time and resources in hardware, training and expertise, whereas process virtualization leverages the customer’s own resources. Potential customers are interested in ease of implementation and integration, and they don’t want to be vendor-bound, the CTO notes, adding that integration takes less than an hour and requires only a small, two-line XML file.

Perhaps owing to Xoreax’s niche as a code-build accelerator, the company is not that widely known despite healthy deployment numbers. Over 100,000 users from some 50 countries rely on XGE-based products, and customers include big-name corporations like Cisco, NVIDIA, Google, Microsoft, eBay and IBM.

Under the OEM model, a software vendor integrates with IncrediBuild-XGE and then provides the product to their end users. One such example is the large diamond grading outfit Sarin, which uses the software to accelerate diamond analysis for tens of thousands of customers. Instead of running all the simulations on a single dedicated server, they can scale out across all their in-house resources and into the cloud. With the XGE client, Sarin has achieved a 20x speedup in system performance. The time required for complex analyses has shrunk from 40-60 hours down to 2-4 hours, while simpler 1-2 hour analyses now complete in 10-15 minutes.

As with most grid or cloud solutions, IncrediBuild is intended for applications that are compute-intensive as opposed to I/O-intensive. It’s well-suited for computational tasks that take a lot of time, such as batch processing or near-real-time workloads. Gaming, financial services and manufacturing are some key verticals.

While early use cases revolved around computation acceleration, the software development space has evolved into their main customer base. IncrediBuild reduces the time of many development environment tasks, including code analysis, unit testing, QA script, stress tests and code verification. The company states it is the most popular off-the-shelf accelerator for Visual Studio code builds.

Another interesting IncrediBuild-XGE use case is the follow-the-moon scenario, where two manufacturing centers located in opposite hemispheres distribute their resources to balance workloads.

On the OEM side, IncrediBuild-XGE has added support for GPUs. This capability offers a lot of potential to engineers working with graphics-intensive workloads, like rendering or CFD. Running an application on a remote server with a dedicated discrete GPU can take minutes instead of hours versus running on a local workstation with an on-board GPU.

IncrediBuild-XGE is compatible with virtually any IaaS offering. Because the engine works with VMware, it runs everywhere you can mount a machine and have VMware run it as part of your subnet. During times of high demand, users can add virtually unlimited resources to their IncrediBuild grid.

On the subject of competition, both the CEO Eyal Maor and the CTO respond that the process virtualization technology is unique and that there aren’t really any comparable offerings. When I inquire about Windows-based grid vendor Digipede specifically, they acknowledge the similarities, but state that they “don’t run into them.”

As for their future roadmap, Xoreax doesn’t want to reveal too much, but the company hints at several announcements that are coming down the pike that could significantly expand their user ecosystem, including plans around Linux and tablets.

Xoreax offers separate pricing terms for its OEM and off-the-shelf products. The cost of the OEM version is negotiated on a per-customer basis, while the off-the-shelf pricing is per agent and based on the total number of cores that the agent is using. The company also offers different product packages, depending on the type of acceleration desired. Interested users can download a trial license, which gives them a chance to kick the tires and decide if the solution is right for them.