HPC in the Cloud Features – HPCwire
https://www.hpcwire.com
Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

Security — The Dark Side of the Cloud
Mon, 25 Jan 2010

Despite the many benefits of cloud computing, security remains one of the biggest challenges. IT organizations have a hard enough time defending their in-house private cloud resources. Companies offering public cloud, pay-for-usage models face a more difficult challenge, since they must serve multiple organizations on the same platform. In response to security threats, there is an opportunity for innovation in flexible cloud-based security service offerings.

Cloud computing is a new computing paradigm for many, but for the rest of us it is simply today's version of timesharing – Timesharing 2.0. On-demand or pay-for-usage has been the norm for many HPC organizations for several decades. These users either never had the budget for their own computing resources or the project only needed limited access to powerful compute resources.

In that sense, HPC users, like the biggest commercial users, already trust cloud computing with their proprietary applications, data and results. They have been using pay-for-usage services for years and many have evolved from the early days of timesharing. In many cases, supercomputing centers and government research labs provide the compute resources. If you are a commercial HPC user in oil & gas, financial services, manufacturing or other industry, the compute resources will probably be found in the corporate datacenter.

This HPC user community has pioneered the necessary tools to allocate, measure and control access to specific users and projects while protecting the users from unauthorized access or modification of applications and data or malicious erasure or premature disclosure of results. This community also developed sophisticated accounting and charge back software that kept track of everything from CPU cycles to memory usage to access time and storage used. Suffice it to say that HPC users are well ahead of their counterparts in the commercial datacenter, and the latter would do well to look toward the former for some guidance in this area.
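To make the chargeback idea concrete, here is a minimal sketch of how usage records might be rolled up into a per-project statement. The resource names, rates, and project names are invented for illustration; real HPC accounting systems are far richer.

```python
from collections import defaultdict

# Hypothetical unit rates; real HPC chargeback schedules are site-specific.
RATES = {"cpu_hours": 0.05, "mem_gb_hours": 0.01, "storage_gb_days": 0.002}

def chargeback(usage_records):
    """Roll raw usage records up into a per-project statement."""
    totals = defaultdict(lambda: defaultdict(float))
    for rec in usage_records:
        for metric, amount in rec["usage"].items():
            totals[rec["project"]][metric] += amount
    statements = {}
    for project, usage in totals.items():
        cost = sum(RATES[m] * v for m, v in usage.items())
        statements[project] = {"usage": dict(usage), "cost_usd": round(cost, 2)}
    return statements

if __name__ == "__main__":
    records = [
        {"project": "drug-design", "usage": {"cpu_hours": 1200, "mem_gb_hours": 4800}},
        {"project": "drug-design", "usage": {"storage_gb_days": 9000}},
        {"project": "aero-cfd", "usage": {"cpu_hours": 800, "mem_gb_hours": 1600}},
    ]
    for project, statement in chargeback(records).items():
        print(project, statement)
```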

Without a doubt, the biggest challenge to cloud computing is security – the dark side of the cloud. In the cloud paradigm, the user community does not or should not care about the physical side of business operations. In most cases, the physical infrastructure is housed, managed and owned by a third party, and you pay for resources used just like the electric and gas utilities. Despite all these wonderful capabilities and features, security remains as much of a concern for the HPC community as it does for consumers concerned about protecting their identity and credit card information.

Imagine for a moment the business ramifications if critical drug data or aircraft design results were altered or compromised by malicious activity, or were released to the world prematurely. The real and intangible costs to your company can be devastating.

Threats to network and information security have existed for decades; this is nothing new. However, the complexity and scale of attacks are rising at an alarming rate, presenting organizations with a huge challenge as they struggle to defend against this ever-present threat. Today, cybercrime is more lucrative and less risky than other forms of criminal activity. Threat levels and attacks are on the rise, striking more and more businesses. Estimates for disruption, data theft and other nefarious activities were pegged at a staggering $1 trillion for 2008. Certainly more than a round-off error!

Just this month, the news headlines in CNET News include “Google China insiders may have helped with attack” and in the Wall Street Journal: “Fallout From Cyber Attack Spreads.” CCTV.com reported, “China’s largest search engine paralyzed in cyber-attack….” And a ZDNet headline on Jan. 21 read: “Microsoft knew of IE zero-day flaw since last September.”

The low risk and low cost of entry of cybercrime make it an attractive and lucrative "business." Cloud-based computing exacerbates the situation by facilitating access to increasing amounts of information. IT organizations have a hard enough time defending their in-house private cloud resources. Companies offering public cloud, pay-for-usage models are faced with a more difficult challenge since they must serve multiple organizations on the same platform. At the same time, there is an opportunity for innovation of flexible cloud-based security service offerings.

The criminal element employs powerful tools such as botnets, enabling attackers to infiltrate large numbers of machines. The "2009 Emerging Cyber Threats Report from Georgia Tech Information Security Center (GTISC)" estimates that botnet-affected machines may comprise 15 percent of online computers. Another report compiled by Panda Labs estimates that in the second quarter of 2008, 10 million botnet computers were used to distribute spam and malware across the Internet each day. With the growth of the cloud paradigm, more and more mission-critical information will flow over the Web to publicly-hosted cloud services. The conventional wisdom of defending the perimeter is insufficient for this dynamic distributed environment. One requirement common across commercial enterprise applications is that users must consider security before signing up for public cloud services.

During SC09, I met with many of the HPC infrastructure vendors and also spoke with some real-world HPC cloud users about the concerns they have using cloud computing for their workloads. (This was not a structured industry survey.) Some did express concerns about security, but mainly in the context of using public cloud resources versus their private cloud resources. However, they also expressed concerns about transitioning their HPC workloads from in-house resources to external public cloud resources, as this is a very different scenario from moving commercial workloads. From a security standpoint, the concerns ranged from unauthorized access to exposure of critical information to malicious activity. Additional concerns include the movement and encryption of data to public clouds and its subsequent persistence once workloads have been completed. Has the data really been deleted? It is all about data integrity.
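One common way to address the data-movement and persistence concerns above is to encrypt data in-house so that only ciphertext ever reaches the public cloud. The sketch below assumes the third-party Python `cryptography` package is available and reduces key management, arguably the hard part, to a locally generated key.

```python
from cryptography.fernet import Fernet

# The key is generated and kept in-house; only ciphertext leaves the private network.
key = Fernet.generate_key()
cipher = Fernet(key)

plaintext = b"proprietary drug-candidate results"   # stand-in for real output data
ciphertext = cipher.encrypt(plaintext)

# `ciphertext` is what would actually be staged to the public cloud provider.
# Later, after pulling the blob back in-house, the round trip is verified:
assert cipher.decrypt(ciphertext) == plaintext
```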

HPC users often have many options available for running their workloads. For example, an academic user may have access to in-house central computing resources shared between multiple departments, or even access to large-scale supercomputing centers. In this environment the user's data, results and applications are still very much "in-house," and even though there is some security risk, users are relatively well protected. HPC users in private industry, especially those in large-scale multinational companies, may have the option of private clouds available for their workloads and, like academic HPC users, have fewer security concerns. However, if the HPC user is looking at commercial third-party providers of public clouds, whether it is Amazon's EC2, Google's App Engine or, better still, HPC-specific cloud vendors, they should spend the time to ensure that these vendors fully address their requirements for security, encryption, and data persistence.

To those organizations that do not have internal private clouds and want to use cloud computing from a third-party vendor, I recommend you consider the following five security evaluation criteria; a simple way to fold them into a repeatable scorecard is sketched after the list:

Evaluate the vendor’s security features very carefully. Ensure that they provide more than just password-protected access.

Look into the collaboration tools and resource sharing to prevent data leakage. Security is all about the data.

Look into authentication and the basic infrastructure security. What happens in the event of a disaster? What’s their disaster recovery plan, backup procedures and how often do they test this process? Has the provider ever had a failure or security breach and if so what happened?

Can they build a private cloud for your workload? What's their data persistence policy? Can they guarantee data transfer security from in-house resources to the public cloud?

Ask to review their best practices policy and procedures and check to see if it includes security audits and regular testing.
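A simple way to make these evaluations repeatable across vendors is to fold the five criteria into a weighted scorecard. The weights and ratings below are purely illustrative, not an industry standard.

```python
# Hypothetical weights for the five criteria above; adjust to your own priorities.
CRITERIA_WEIGHTS = {
    "security_beyond_passwords": 0.25,
    "data_leakage_controls": 0.20,
    "authentication_and_disaster_recovery": 0.25,
    "private_cloud_and_data_persistence": 0.15,
    "best_practices_and_security_audits": 0.15,
}

def score_vendor(name, ratings):
    """ratings: criterion -> 0..5 rating assigned by the evaluation team."""
    total = sum(CRITERIA_WEIGHTS[c] * ratings[c] for c in CRITERIA_WEIGHTS)
    return name, round(total, 2)   # maximum possible score is 5.0

print(score_vendor("vendor-a", {
    "security_beyond_passwords": 4,
    "data_leakage_controls": 3,
    "authentication_and_disaster_recovery": 5,
    "private_cloud_and_data_persistence": 2,
    "best_practices_and_security_audits": 4,
}))
```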

Cloud computing is not so much a new technology as it is a new delivery model, but its impact will be enormous. Research firm IDC estimated worldwide cloud services at $17.4 billion in 2009, forecast to grow to $44.2 billion in 2013. The economies of scale and centralized resources create new security challenges for an already stressed IT infrastructure.

This concentration of resources and data will be a tempting target for cyber criminals. Consequently, cloud-based security must be more robust. Spend the time to evaluate the security and make sure it is designed in and not added on after a breach. Partner with a trusted vendor. And if in doubt, seek advice.

About the Author

Steve Campbell, an HPC Industry Consultant and HPC/Cloud Evangelist, has held senior VP positions in product management and product marketing for HPC and enterprise vendors. Campbell has served as vice president of marketing for Hitachi, Sun Microsystems and FPS Computing, and has also held lead marketing roles at Convex Computer Corporation and Scientific Computer Systems. Campbell has also served on the boards of, and as interim CEO/CMO of, several early-stage technology companies.

The Impact of Cloud Computing on Corporate IT Governance
Mon, 25 Jan 2010

This is the second in a series of articles discussing the impact of cloud computing on IT governance. The first article dealt with more informal internal IT processes, while this article examines cloud's impact from the perspective of the formal IT governance/steering committee.

While cloud computing is enabling some fundamental changes in how IT groups deliver services, from a corporate management viewpoint the basic principles of IT governance still hold true. However, the advent of cloud computing is having an increasing impact on how the components of the governance process are executed. For the purposes of this article, we will use the COBIT model (Control OBjectives for Information and related Technology), which comprises five major process focus areas: Strategic Alignment, Value Delivery, Resource Management, Risk Management, and Performance Measurement.

Governance at its core is the effective management of the IT function to ensure that an organization is realizing maximum value from its investments in information technology. Many companies, especially those with considerable IT budgets, have implemented significant internal IT governance procedures to manage their IT investment portfolios. This governance function provides the processes and framework for the management team to analyze, understand, and manage the level of return on the organization's technology investments. Industry studies show that, on average, companies with effective IT governance processes in place spend 5-7 percent less to deliver the same functionality as companies that do not.

Any proper IT governance function also requires active management participation, the proper forum to make IT related decisions, and effective communication between the IT organization and the company’s management team. While these factors are critical to creating a successful IT governance function, there are five essential areas of process focus as spelled out in the COBIT model, which are described here:

Strategic Alignment: This focuses on ensuring the linkage of business and IT plans; defining, maintaining and validating the IT value proposition; and aligning IT projects and operations with enterprise operations.

Value Delivery: This is about executing the value proposition throughout the delivery cycle, ensuring that IT delivers the promised benefits against the strategy, concentrating on optimizing costs and proving the intrinsic value of IT.

Resource Management: This is about the optimal investment in, and the proper management of, critical IT resources: applications, information, infrastructure and people. Key issues relate to the optimization of system knowledge and technical infrastructure.

Risk Management: This requires risk awareness by senior corporate officers, a clear understanding of the enterprise's appetite for risk, understanding of compliance requirements, transparency about the significant risks to the enterprise, and embedding of risk management responsibilities into the IT organization.

Performance Measurement: This tracks and monitors strategy implementation, project completion, resource usage, process performance and service delivery, giving management visibility into whether IT is delivering against plan.

If the IT governance framework isn’t implemented and managed correctly, this can adversely impact how well IT delivers on its commitments to its customers along with how IT is perceived within the organization. Lack of effective IT strategy, governance and oversight can cause continued issues with project overruns or even outright failures, project stakeholder dissatisfaction, and reduced business value received in relation to the resources expended. Companies that properly manage their IT function operate with a higher level of certainty that they are receiving an appropriate level of value from their investments in information technology. They also have the ability to ensure that the IT group is working on the projects that provide the most business value to the organization.

Now that we have discussed the impact of cloud computing on the IT group, let's examine how cloud computing affects the five governance factors as defined in the COBIT model.

Value Delivery: Under the pre-cloud provisioning model, most new projects included costs for hardware to support the application, and usually for testing and development environments as well. IT was also guilty of over-buying hardware to ensure that if there were performance issues they were at least not hardware-related, and to provide capacity for peak loads that might never materialize. Cloud computing offers several options that can change the cost model and free up more of the IT budget for innovation rather than for under-utilized hardware and associated support. One option would be to provision test and QA instances via the cloud instead of purchasing additional servers, or to shift peak loads to the cloud instead of maintaining that capacity internally. Cloud-based tools could also enable rapid prototyping, allowing for quicker delivery of business applications. With the potential cost savings, projects that were cost prohibitive may now be viable, or funds may be freed up to support additional projects. Certainly some of these issues can be addressed using virtualization, but cloud gives the IT group another tool in its toolkit to attack business problems. With the right strategy and mix of technologies, the IT group can deliver more value for potentially less money. There is one caveat: to ensure that proper value is being delivered, the IT organization needs a firm grasp of its internal cost structure, as mentioned above, in order to correctly drive investments.

Resource Management: One of the challenges in any IT group is appropriately managing the resources at its disposal to provide as much business value as possible. Cloud computing can impact the resources available to IT in a variety of ways. From a personnel standpoint, cloud will require a shift in operational skill sets from an internally focused system services mentality to a more holistic viewpoint oriented around delivering business value rather than system infrastructure. IT staff will need increased knowledge of the business value chain to better understand where cloud technologies can fit in and to recognize where they are not appropriate. IT management should include a plan to deal with the personnel skills changes required and incorporate that into any overall cloud adoption strategy. Cloud can also impact system resources by requiring additional network bandwidth, monitoring tools, or other items to appropriately manage and maintain this new hybrid environment.

Risk Management: This is one of the most critical areas of governance impacted by cloud computing. Critical questions arise when cloud computing is brought into the existing IT ecosystem. These include questions oriented to data protection and business continuity, such as the impact on existing disaster recovery plans, how backup/restore and data archival policies are affected, and how business continuity plans are affected. IT management must have a clear understanding of risk related to vendor service levels, strategies for mitigating that risk, and how any potential outages would impact the business. IT also must examine security access and the potential risks of putting corporate data into the cloud, and what the potential impact on the business might be if data is lost or access control is breached. Other risks that need to be addressed revolve around the viability of the vendor, the long-term prospects of any particular technology, and the impact on the existing IT infrastructure. All these questions and more must be asked and addressed, particularly as cloud computing is embraced for more critical business applications and IT services.

Performance Measurement: This area looks at the overall achievement of the IT organization. While cloud does not directly change the purpose of this portion of the governance process, it does modify some of the underlying key performance measures. Performance measurement is directed at providing management with information on how the IT group is performing outside of conventional accounting measures, such as project completion, resource usage, service delivery, and user support metrics. While not integral to the adoption of cloud computing, the setting of governance goals and objectives should take into account the impact of using cloud resources. This could include completing projects more quickly by provisioning resources via the cloud, using cloud resources to speed prototyping, or achieving higher efficiency in funding and personnel by leveraging cloud capabilities. IT organizations will need to review their metrics and measurements and adjust them accordingly.

Strategic Alignment: Since the primary goal of IT governance is to ensure alignment with organizational objectives, cloud computing does not have a significant impact on this area of the IT governance process. Regardless of the technical architecture being proposed for a project, the management team needs to maintain the linkage between business goals and IT plans and ensure that IT projects and operations align with enterprise needs.

Conclusion

Effective governance is a critical process and is key to maximizing the value any organization receives from its investment in IT. To take full advantage of what cloud computing can provide, IT organizations need to reevaluate their corporate governance procedures and adapt them as necessary. For those companies willing to invest in the appropriate governance processes, the future looks bright; for those not ready or willing, the future looks cloudy indeed.

About the Author

Bruce Maches is a 32-year IT veteran and has worked or consulted with firms such as IBM, Pfizer, Eli Lilly, SAIC, and Abbott. He can be reached at bmaches@rfittech.com.

The Impact of Cloud Computing on Internal IT Governance
Mon, 25 Jan 2010

This is the first of two articles dealing with the impact of cloud computing on the governance of the IT function. What follows is a description of the less formal internal IT governance mechanisms found in many organizations. The second article is directed at the more formal corporate IT governance steering committees and associated practices.

In theory, the realization of IT governance should be a seamless process running from the board room to actual delivery of IT services. In practice however, many organizations have an institutionalized steering committee with its associated processes that supports the organization’s IT related goals while the IT group has a leadership team or council whose governance activities are focused inward on such things as technical issues, standards, and resource management.

All IT organizations must navigate an ever-changing sea of user priorities, vendor offerings, business needs, regulatory requirements and changing technologies while, at the same time, delivering the systems and applications required to support the operations of the business. The IT organizations that can manage increasingly complex environments are the ones better suited to leverage new technologies such as cloud computing. As companies determine how best to use cloud technology, those organizations with mature IT governance functions and clear IT architectural strategies will be better positioned to take advantage of cloud computing and to integrate its capabilities into their existing technology infrastructures.

What is IT Governance?

IT governance involves putting defined processes around how IT resources are directed and how organizations align IT efforts with business strategy. Any IT governance framework should answer some key questions, such as how the IT department is functioning overall, what key metrics management needs to manage IT efforts, and what return the organization's IT investments are providing. Every company needs a defined process to ensure that the IT function supports the organization's overall strategies and that IT activities are directly linked to corporate objectives.

Impact of Cloud Computing

Any potentially disruptive technology, such as cloud computing, can have a significant impact on how IT designs, creates, and delivers applications and systems to the enterprise. This impact can be either positive or negative depending on how receptive the IT organization is to change and its ability to formulate plans to leverage new technologies and incorporate them into its project portfolio.

In order for an IT department to be ready to take full advantage of what cloud computing, or any new technology, can offer, it needs to have in place the following internally focused processes and capabilities.

Understanding the Costs to Deliver Services: The IT group needs to have a thorough understanding of the services it provides and the associated costs to deliver those services, both for internal and external needs. An example of this would be for the IT department to recast its budget into a catalog of services and associated costs. This service catalog could include items such as the cost to provision and deliver a new server or to set up a new database instance. Without an understanding of the IT group's own "cost of goods," there is no rational way to determine whether a cloud approach for a particular project is more economical than using only internal resources; a toy comparison is sketched below.
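As a purely hypothetical illustration of why the "cost of goods" matters, the sketch below compares an internal service-catalog price against a pay-per-use cloud price for the same workload. All the figures are placeholders, not benchmarks.

```python
# Hypothetical service-catalog entries: fully loaded internal cost per unit.
INTERNAL_CATALOG = {
    "provision_server": 1800.0,   # one-time cost per server
    "server_month": 350.0,        # power, space and support per server-month
}

def internal_cost(servers, months):
    one_time = servers * INTERNAL_CATALOG["provision_server"]
    recurring = servers * months * INTERNAL_CATALOG["server_month"]
    return one_time + recurring

def cloud_cost(servers, months, hourly_rate=0.80, hours_per_month=730):
    return servers * months * hours_per_month * hourly_rate

for months in (3, 12, 36):
    internal = internal_cost(4, months)
    cloud = cloud_cost(4, months)
    cheaper = "cloud" if cloud < internal else "internal"
    print(f"{months:>2} months: internal ${internal:,.0f} vs cloud ${cloud:,.0f} -> {cheaper}")
```

Run over longer horizons, a sketch like this makes the break-even point explicit rather than leaving the internal-versus-cloud decision to intuition.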

Defined Application & Platform Standards: The IT group needs to have in place a clear set of application and platform standards that define the system environments it will provide and support, as well as sufficient checks built into the system life cycle to ensure that new projects follow these standards. This allows the IT group to reduce the complexity of the overall IT ecosystem and is a critical requirement to creating a rational cost structure. If the existing applications and systems are a hodge-podge of different platforms, operating systems, databases and vendors it will be very difficult to fully define the costs to deliver IT services. This also impedes the ability of the IT organization to create a coherent strategy for incorporating new technologies, such as cloud, into the existing IT environment while reducing the impact on existing operations.

Progressive IT Leadership: Efficiently integrating any new technology requires strong and forward-looking leadership that is willing to seek out and evaluate new, potentially valuable technologies. Strong IT leadership will help to drive the creation of a cohesive strategy for adopting cloud and to ensure that all areas within the IT group are on board with the direction the organization is taking.

Effective Internal IT Communications: Clear and consistent communication within the IT group is an integral function for any IT organization, especially with regard to adopting a new technology such as cloud. The various groups within IT (operations, support, applications and administration) need to have consistent and clear communications with each other. This communication should include, at a minimum, the strategy for cloud usage, technical issues that might arise, existing or planned projects where cloud might be leveraged, and the long-term strategic planning directions for the IT group.

Comprehensive Risk Management Practices: Incorporating any new technology into an existing technology base carries inherent risk that must be effectively managed and dealt with. The IT group needs to have in place the risk management practices required to ensure that risks to existing operations and to project success are identified and mitigated. The IT group needs to look at risks on a different level than the overall enterprise. It must also include factors such as technology direction, the strength of the vendor, impact on existing systems, data security, access control, backup/recovery, and any potential disruptions to the operations of the business if there were a service interruption.

Conclusion

To properly take advantage of what cloud computing can provide, IT organizations need to take a hard look at their internal governance processes and adapt them as necessary. Those organizations that do not have the structure and disciplines in place to effectively leverage new technologies will always be behind the curve in delivering value and at a competitive disadvantage. Cloud computing not only raises the bar on how strategy and governance intersect; it also requires centralized control to ensure that a coherent strategy is followed and risks are minimized.

The next article in this series examines the impact of cloud on more formalized IT governance functions using the COBIT model as a reference point.

About the Author

Bruce Maches is a 32-year IT veteran and has worked or consulted with firms such as IBM, Pfizer, Eli Lilly, SAIC, and Abbott. He can be reached at bmaches@rfittech.com.

Timesharing 2.0
Tue, 03 Nov 2009

Cloud computing: Is there anything new to say? A fair question, as it seems that hardly a week, or even a day, goes by without a new announcement about some new product or service for the "cloud." When you read about cloud computing in The Economist, BusinessWeek or Forbes, you know something is really happening. Further evidence of this is the series of IBM prime time TV ads extolling the virtues of cloud computing. The technology has become mainstream.

One of the reasons business publications are writing about the cloud is that the technology is breaking out from its roots in high performance computing (HPC) and is being adopted for commercial applications. But is cloud computing today's hot technology that promises to lower TCO, reduce energy costs, and enable dynamic, agile datacenters, or is it just the latest hype? That is, will cloud computing really happen and will it deliver on its promises? And what does it mean for high performance computing?

Picture this: You're sitting at a keyboard and you log in to the system. Your ID is verified and you begin to enter the data your application needs. When you have finished entering the data, the application begins executing your workload, along with many other users' workloads. Eventually your workload completes and you receive the results, together with a statement for CPU time, memory usage, disk I/O, connect time, and so on; a very comprehensive statement for all the services used. This method of access allows many users to share the same system, dramatically lowering the cost of computing, enabling organizations to use compute resources without owning them, and creating a development environment from which new applications emerge.

Sound familiar? What I described was my experience using a computer system at a college in London, circa 1971. The era of timesharing had just begun. The computer system was in the datacenter (the glass house) and utilized then-new technologies such as virtualization, based on LPARs and domains, and workflow management. So what is different this time around?

Access today is from any Web-based device connected to the Internet; anytime, anywhere, any device has finally arrived.

The use of standards-based software, connectivity, etc., enables heterogeneous systems to co-exist within the same cloud.

Rich suites of management and middleware software and virtualization tools relieve IT administrators of the burden of managing this heterogeneous infrastructure and mapping workloads to infrastructure.

It’s that simple. Timesharing 2.0, better known as cloud computing, has arrived. Enough of the soapbox.

Cloud computing basics

Cloud computing is becoming ubiquitous and yet it is still evolving. Consequently, there is no accepted industry definition. Gartner defines cloud computing as “a style of computing in which scalable and elastic IT-enabled capabilities are delivered as a service to external customers using Internet technologies.”

Cloud computing is the provision of dynamically scalable and often virtualized resources as a service over the Internet on a utility basis. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the “cloud” that supports them. Cloud computing services often provide common business applications online that are accessed from a web browser, while the software and data are stored on the servers.

The general consensus is that cloud computing has the following attributes:

Users can access their applications and data from any device connected to the Internet.

The concept generally incorporates a combination of the following:

Infrastructure as a Service (IaaS)

Platform as a Service (PaaS)

Software as a Service (SaaS)

It is frequently associated with virtualization and Web 2.0 technologies.

It exhibits elastic scaling – dynamic and fine grained.

Users can access large scale computing resources without making the heavy investment in IT infrastructure.

The big benefit of cloud computing is that companies can access the latest IT infrastructure for their workloads without having to make a huge investment in infrastructure; they simply pay for usage. This is good for everyone, but for small and economically strapped firms it is especially attractive.

One of the key software technologies is virtualization. This is significantly different from Timesharing 1.0, where virtualization was proprietary and built into the hardware. Today, virtualization is a fundamental technology that enables cloud computing resource provisioning, for example in heterogeneous environments. It is based on industry standards and uses the x86 virtualization (VT) instructions to enhance performance and support multiple operating systems. Hypervisor technology is complemented by a rich set of tools, from resource provisioning to live migration.
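As a small, Linux-specific aside on the hardware-assist point: whether a host's x86 processors advertise the virtualization extensions (Intel VT-x shows up as the `vmx` CPU flag, AMD-V as `svm`) can be checked directly from `/proc/cpuinfo`. This sketch assumes a Linux host.

```python
def hardware_virtualization_flags(cpuinfo_path="/proc/cpuinfo"):
    """Report which x86 virtualization extensions the host CPU advertises (Linux only)."""
    try:
        with open(cpuinfo_path) as f:
            text = f.read()
    except OSError:
        return set()   # not Linux, or /proc is unavailable
    found = set()
    for line in text.splitlines():
        if line.startswith("flags"):
            words = line.split()
            if "vmx" in words:
                found.add("Intel VT-x")
            if "svm" in words:
                found.add("AMD-V")
    return found

print(hardware_virtualization_flags() or "no hardware virtualization extensions reported")
```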

Delivery models

Cloud computing architects are faced with many decisions and choices when developing cloud deployment models. There are several different models that are accepted in the industry today:

Private Cloud: Operated solely by and for the organization.

Public Cloud: Available to the general public on a pay-for-usage model.

Hybrid Cloud: A composition of private and public clouds.

There are infrastructure delivery models for seasonal fluctuations, for example, at tax time. In such models, companies with private clouds open up part of their infrastructure, creating public clouds to manage seasonal traffic.
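That seasonal-peak pattern is essentially a scheduling decision: keep jobs on the private cloud while capacity remains and spill the overflow to a public provider. A toy sketch of the routing logic, with invented capacities and job sizes:

```python
def route_jobs(jobs, private_capacity_cores):
    """Greedy split of a job queue between private capacity and public-cloud burst."""
    private, burst = [], []
    used = 0
    for job_id, cores in jobs:
        if used + cores <= private_capacity_cores:
            private.append(job_id)
            used += cores
        else:
            burst.append(job_id)   # overflow is sent to the public cloud
    return private, burst

jobs = [("render-01", 64), ("risk-eod", 256), ("tax-batch", 512), ("qa-suite", 32)]
on_premises, to_cloud = route_jobs(jobs, private_capacity_cores=600)
print("private cloud:", on_premises)
print("public cloud burst:", to_cloud)
```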

Trends

IT vendors will continue to evolve their product lines and develop more "marketingware" as they strive to define their uniqueness, value-add and messaging. Many of them need a lot of help in differentiating themselves.

But there are a number of offerings from existing vendors that are worth watching:

The datacenter-in-a-box or container. This is a self-contained IT datacenter delivered in a container, such as Sun's Modular Datacenter or Verari's FOREST Container. These container-based datacenters can provide almost instant datacenter capacity for today's cloud computing infrastructure and are designed to be eco-friendly, cost-effective, and flexible.

The traditional approach. Solutions like IBM's Cloudburst, based on IBM's BladeCenter, or HP's BladeSystem Matrix are conventional blade designs that can serve as cloud infrastructure. These datacenter-in-a-rack solutions can help organizations drive down complexity and growing operating costs, in particular reducing OPEX utility costs by delivering greener computing.

Management and middleware software. Simplifying the deployment and operation of hardware (servers, storage, and networking) is the critical glue that makes the cloud model possible. The model depends on this software to hide the complexity of the underlying infrastructure from the end user. For IT organizations that are building and delivering cloud services, rich software tools ease the task while reducing the time to deploy services and simplifying management.

Security. The protection of data and algorithms is perhaps the biggest concern end users have regarding cloud computing. Cybercrime is on the rise despite efforts to thwart the hackers. As consumer technology, social networking and Web 2.0 continue their rapid adoption in the workplace, building secure cloud IT infrastructure is becoming more and more difficult. The best advice here is to design in security before you start building and deploying services. Don't wait for a breach before taking action. Do your research.

Service. We're starting to see third-party compute cycle brokers emerge. Nimbis Services, for example, connects its clients through an industry-wide brokerage and clearinghouse with third-party compute resources, commercial application software and expertise. The goal is to reduce risk, provide pay-as-you-go pricing, and match users with resources.

Hybrid architectures. Over the past three or four decades, HPC has seen many architectures applied to complex scientific workloads: big SMP nodes, vector supercomputers such as Cray, mini-supercomputers such as Convex that changed the price/performance dynamics of HPC, and numerous MPP systems. The rise of powerful commodity chipsets changed the market forever and gave birth to distributed cluster and grid architectures connected via high-speed network fabrics. One architecture that survived is the symmetric multiprocessor (SMP), where multiple CPUs access a large shared memory, typically ccNUMA, with a single OS instance. Today that architecture exists at the chip level: the multicore, 64-bit x86 chipsets from Intel and AMD are effectively SMPs on a chip.

For example, Convey Computer's server architecture combines the familiar world of x86 computing with hardware-based, application-specific instructions to accelerate certain HPC applications. Another approach to hybrid computing is that provided by vendors such as 3Leaf Systems and ScaleMP. These solutions enable a group of x86 servers to look like one big SMP system with a single pool of CPU processing and memory that can be dynamically allocated and/or repurposed to applications as needed. Essentially they turn a distributed architecture into a ccNUMA SMP.

Storage and networking. Most analysts agree that storage is doubling every eighteen months. HPC workloads, in particular, have huge storage needs that can stress the system. There are developments such as the recent Panasas and Penguin partnership to provide high-performance parallel storage and on-demand services designed specifically for high performance computing. Amazon S3 (Simple Storage Service) is an online storage web service offered by Amazon Web Services, providing essentially unlimited storage through a simple web services interface.
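For reference, staging a result file to Amazon S3 is only a few lines with today's boto3 SDK (which post-dates this article); the bucket and object names below are placeholders, and the calls assume AWS credentials are already configured.

```python
import boto3

s3 = boto3.client("s3")   # credentials come from the environment or ~/.aws config

bucket = "example-hpc-results"   # placeholder bucket name
s3.upload_file("run_0042_output.h5", bucket, "project-x/run_0042_output.h5")

# List what has been staged so far under the project prefix.
response = s3.list_objects_v2(Bucket=bucket, Prefix="project-x/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```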

In the network arena, InfiniBand continues to increase its market penetration due to lower price points and a more mature software ecosystem. More interesting, however, is that several vendors are now building InfiniBand capabilities into their HPC-focused cloud solutions.

The increased demand for network performance is driven by HPC applications, and the new generations of x86 chips are able to fully utilize 10 Gigabit Ethernet (10GigE). Performance demand coupled with increased volumes of data creates the perfect storm for 10GigE adoption. One final comment on networking is the expected growth in converged network adapters (CNAs) and Fibre Channel over Ethernet (FCoE). Both offer the benefits of reduced cost and higher throughput.

How big is the opportunity?

For the vendors of products and services, the growth opportunity is large and growing rapidly. In some cases, it is hard to get any attention for your offerings if you do not have the word "cloud" associated with the product or service.

At the International Supercomputing Conference (ISC'09) in June 2009, Platform Computing surveyed IT executives who attended the conference. Over a quarter (28 percent) of the IT executives surveyed were planning to deploy private clouds in 2009. Increased application workload demand and the need for IT to cut costs are cited as two major factors behind the planned adoption of HPC clouds.

The traditional analyst firms that specialize in market sizing and growth are predicting a bright future for IT infrastructure and services in the cloud. One of the most recent forecasts is in an October 2009 IDC Exchange blog titled IDC’s New IT Cloud Services Forecast: 2009-2013. In this post, IDC is forecasting that “the five year growth outlook remains strong, with a five-year annual growth rate of 26 percent — over six times the rate of traditional IT offerings.” Full details will be published in the upcoming IDC’s Cloud Services: Global Overview.
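The 26 percent figure is consistent with the IDC dollar estimates quoted in the security article earlier in this feed ($17.4 billion in 2009 growing to $44.2 billion in 2013); a quick back-of-the-envelope check:

```python
start, end, years = 17.4, 44.2, 4   # $ billions, 2009 -> 2013
cagr = (end / start) ** (1 / years) - 1
print(f"implied annual growth rate: {cagr:.1%}")   # roughly 26%
```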

The HPC connection

For the high performance computing space, there are a growing number of companies and organizations providing services that target the special needs of this group of users. Our companion article encapsulates the vendors that are addressing this market today.

The HPC research community is also on board. In February of this year, UC Berkeley researchers released a report (PDF) discussing the impact and future directions of cloud computing. It served as one of the first academic treatises on the subject. Eight months later, the US Department of Energy launched a five-year, $32 million program to study how scientific codes can make use of cloud technology. That work will take place at the DOE's Argonne and Berkeley national laboratories.

Conclusion

Cloud computing is not new; it is largely an evolution of IT infrastructure. The pay-as-you-go model of cloud computing has its roots in the timesharing era of the 1970s. As such, we are seeing cloud computing grow from a promising business concept to one of the fastest growing segments of the IT industry.

Organizations with challenging workload profiles, as well as recession-hit companies, are realizing they can access best-in-breed applications and infrastructure easily, quickly, and on a pay-for-usage basis. This now includes HPC users, who are looking to the cloud to maximize their FLOPS per dollar.

About the Author

Steve Campbell, an HPC Industry Consultant and HPC/Cloud Evangelist, has held senior VP positions in product management and product marketing for HPC and enterprise vendors. Campbell has served as vice president of marketing for Hitachi, Sun Microsystems and FPS Computing, and has also held lead marketing roles at Convex Computer Corporation and Scientific Computer Systems. Campbell has also served on the boards of, and as interim CEO/CMO of, several early-stage technology companies.

Cloud Computing Opportunities in HPC
Mon, 02 Nov 2009

This article is excerpted from "Cloud Opportunities in HPC: Market Taxonomy," published by InterSect360 Research. The full article was distributed to subscribers of the InterSect360 market advisory service and can also be obtained by contacting sales@intersect360.com.

In Life, the Universe, and Everything, the third book of Douglas Adams’ whimsical Hitchhiker fantasy trilogy, cosmic wayfarer Ford Prefect describes how an object, even a large object, could effectively be rendered invisible to the general populace by surrounding it with an “SEP field” that causes would-be observers to avoid recognizing Somebody Else’s Problem. “An SEP,” Ford helpfully explains, “is something we can’t see, or don’t see, or our brain doesn’t let us see, because we think that it’s somebody else’s problem.”

If we were to reinterpret SEP to stand for “Somebody Else’s Processing,” we would be well on the way to a definition of cloud computing.

The term “cloud” comes from the engineering practice of drawing a cloud in a schematic to represent an external resource that the engineer’s design will interact with — a part of the workflow that he or she will assume is working but that is not part of that specific design. For example, a processor designer might draw a cloud to represent a memory system, with arrows indicating the flow of data in and out of the memory cloud. Cloud computing takes this concept to an organizational level; entire sections of IT workflows can now be virtualized into resources that are someone else’s concern.

Cloud computing is therefore a new instantiation of distributed computing. It is built on grid computing concepts and technology and further enabled by Internet technologies for access. Cloud computing is the delivery of some part of an IT workflow — such as computational cycles, data storage, or application hosting — using an Internet-style interface. This definition includes Web-immersed intranets as conduits for accessing private clouds.

Cloud computing is currently driven by business models that attempt to utilize or monetize unused resources. Grid, virtualization, and now cloud technologies have attempted to find and tap idle resources, thus reducing costs or generating revenue. The most interesting difference between cloud computing and earlier forms of distributed computing is that in developing ultra-scale computing centers, organizations such as Google and Amazon incidentally built out significant caches of occasionally idle computing resources that could be made generally available through the Internet. Furthermore these organizations found that they had developed significant skills in constructing and managing these resources, and economies of scale allowed them to purchase incremental equipment at relatively lower prices. The cloud was born as an effort to monetize those skills, economic advantages, and excess capacity.

This is important because, from a business model point of view, the cloud resources came into existence at no cost, with minimal incremental support requirements. The majority of the costs are borne by the core businesses, and therefore, at least initially, customers of the excess capacity do not need to foot the bill for capital expenditures. Costs associated with staff training, facilities, and development are similarly already fully amortized and absorbed by the parent businesses. There is little more appealing than being able to sell something that you get for free.

With such an appealing proposition in play, many other organizations are scrambling to see whether they have an infrastructure — public or private — that can be exploited for gain through cloud computing. However, when significant excess capacity does not exist, or if it cannot be leveraged in a timely or reliable fashion, it is not clear what sustainable business models exist for cloud computing.

High-end, public cloud computing offerings represent a convergence of grid and Internet technologies, potentially enabling workable new business models. Smaller, private clouds are a technical evolution that expands the ease of use and deployment of grids in more organizations.

As cloud computing technologies mature, InterSect360 Research sees several possible business models that could evolve. Although we emphasize High Performance Computing in our analysis, cloud computing transcends HPC, and similar models will exist in non-HPC markets.

Utility Computing Models

Cloud computing provides a methodology for extending utility computing access models. Utility computing is not new; it has been touted for several years as a way for users to manage peaks in demand, extend capabilities, or reduce costs. Traditionally, limitations in network bandwidth, security issues, software licensing models, and repeatability of results have acted as barriers to adoption, and all of these still need to be addressed with cloud.

There are four major variations on the potential utility computing models with cloud:

Cycles On Demand

The cycles-on-demand model is the most basic approach to cloud computing. The cloud supplier provides hardware and basic software environments, and the user provides application software, application data, and any additional middleware required. In this case users are simply buying access to computer processors, which they provision and manage as needed in order to run their applications, after which the resources are “returned” to the cloud provider. Users are charged for the time the resources are in use, plus possibly some overhead costs. The demands are relatively low on the cloud provider, and relatively high on the user in terms of making sure there is effective utility generated by the rented resources.
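In current terms, cycles on demand maps to renting raw instances from a provider such as Amazon EC2. A minimal sketch using boto3 follows; the region, AMI ID, instance type and counts are placeholders, and a real deployment would also need key pairs, security groups and storage.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Rent compute capacity only for the duration of the run.
reservation = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder: an image with the application stack installed
    InstanceType="c5.4xlarge",
    MinCount=1,
    MaxCount=8,
)
instance_ids = [i["InstanceId"] for i in reservation["Instances"]]
print("provisioned:", instance_ids)

# ... run the workload and collect results ...

# "Return" the rented resources so that billing stops.
ec2.terminate_instances(InstanceIds=instance_ids)
```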

Storage Clouds

The storage cloud model complements the cycles-on-demand model both in terms of operational approach — users buy disk space at a cloud provider's facility — and in terms of providing a more complete solution for cycles users — a place to put programs and data between job runs. In the storage-on-demand approach the cloud is used:

As the final (archival) stage in hierarchical storage management schemes (even if it is a two-level hierarchy: local disk and cloud). On the consumer side this is essentially the concept used for PC backup services.

As a file-sharing buffer where users can place data that can be accessed at a later time by other users. This approach is at the heart of photo sharing sites, and arguably of social sites such as Facebook and LinkedIn. This same concept is also used for shared science databases in areas such as genomics and chemistry.

Software as a Service

Software as a service (SaaS) extends the basic cycles-on-demand model by providing application software within the cloud. This model addresses software licensing issues by bundling the software costs within the cloud processing costs. It also addresses software certification and results repeatability issues because the cloud provider controls both the hardware and software environment and can provide specific system images to users.

SaaS also has advantages for providers: it allows them to sell services along with the software, and to use the cloud as a demonstration platform for direct sales of software products. In addition, the user is able to turn much of the system administration task over to the provider. The major drawback to this strategy is that users generally run a series of software packages as part of their overall R&D workflow, in which case data would need to be moved into and out of the cloud for specific stages of the workflow, or the cloud provider must support an end-to-end process.

Environment Hosting

Environment hosting is the use of a service to support virtually all computational tasks, with servers, storage, and software all being maintained by a third party. This concept can include constructs such as platform as a service (PaaS) and infrastructure as a service (IaaS). Arguably, environment hosting in the cloud is an oxymoron; however, it represents the upper end of the utility computing spectrum and a logical destination of cloud strategies. This approach addresses software, result repeatability, and most networking issues by simply providing dedicated resources all in one (logical) place. It addresses many of the technical security issues, but not a consumer organization's security problem of inserting a third party into the workflow process.

Cloud-Generated Markets

In addition to the models for those who would consume resources through the cloud, there are applications made possible by the combination of Internet communications and large computing resources. This includes opportunities for organizations to become cloud computing service providers, either externally or internally. There is also the potential for some secondary markets to be enabled by the adoption of cloud technologies.

Restructuring of Internet-Based Service Infrastructures

One of the most interesting aspects of cloud computing is that Internet companies whose value-add and expertise lie in intellectual property or content (as opposed to purchasing, managing, and running computer hardware systems) could move their internal computing architecture to the cloud, while maintaining system management and operating control in-house. With this strategy an organization would move the bulk of its computing to the cloud, keeping only what is necessary for communications and cloud management; in doing so it converts internal costs for systems, software, staff, space and power into usage fees in the cloud. Cloud technology and service providers facilitate and accelerate the industry's evolution towards a network of interrelated specialty companies, as opposed to groups of organizations each performing the same set of infrastructure functions in-house. The major issue potentially holding this model back is cost; i.e., the level of premium users would be willing to pay for a service versus a do-it-yourself solution.

Personal Clouds

This strategy would replace personal computers with an advanced terminal connected to a cloud utility that holds all of the user's data and software. The advantage for users is that they would be relieved of the burden of purchasing, maintaining, and upgrading their personal systems. They would also have professional support for such tasks as system backup and security, and would be able to access their computing environment from any Web-connected device.

This strategy may represent the evolutionary future of the Internet, particularly as more devices become Web-enabled and the relationship between the Web and the personal computer is weakened by competing devices, such as smart phones. The main challenge to this model is overall bandwidth on the Internet. Side effects of such an evolution would include replacing the role of the operating system with a Web browser plus whatever backend environment the cloud supplier chose to provide, and the creation of a new product class of Web terminals.

InterSect360 Research Analysis

We see cloud computing as part of the logical progression in distributed computing. It is not completely revolutionary, nor is it a panacea that will provide any service that can be imagined. The business models must be considered in terms of cost and control, barriers and benefits.

Of all the cloud business models, InterSect360 Research believes that SaaS has the highest potential for success within HPC. It addresses several of the major dampening factors associated with cloud and provides additional revenue opportunities in the services arena. It also targets industrial users, who would be the most likely to pay a premium for the product, without attempting to develop competing solutions. Furthermore, companies can adopt SaaS models in the cloud in a phased or tiered way, first proving the concept in private clouds before giving themselves over to public or hybrid models. (This same phenomenon persists with private and public grids today.)

Organizations that have experience with the software and in-house operations may look to SaaS options for peak load management and capacity extension. However, we believe the greater opportunity is for selling packaged cloud computing, software, and start-up services to companies testing HPC solutions. Our research indicates that there are major start-up barriers to using HPC solutions among small and medium companies. These barriers include finding the expertise to create the organization's first scalable digital models.

The major barrier to SaaS adoption in HPC is the fragmentation of the applications software sector of the industry. The boutique nature of the opportunity may indicate there is not sufficient volume to merit an ISV's investment to create and market cloud-enabled versions of its applications. Interestingly, in a recursive manner, small SaaS providers could theoretically tap into larger cycles-on-demand cloud providers to supply the computing resources.

Similarly, implementation of environment hosting within current cloud environments would entail significant effort by an HPC user organization to set up and manage storage and software environments. It would also be limited by software licensing issues, for industrial users in particular. Thus market opportunities for this option are very limited at this time. That said, a small organization could conceivably do all its computing in the cloud, keeping all its data on a cloud storage system, using only internally developed, open-source, or SaaS software, and trusting in its small size as part of a herd to provide security.

Finally, we note that Web-based software services are not new to the market; they currently range from income tax preparation services to on-line gaming companies. SaaS fits into cloud markets based on the concept of work being sent to an outside party and results being returned, without the sender having knowledge of exactly how those results are generated. For some users, SaaS may inherently make sense. Ultimately, the best way to help users adopt HPC applications may be to make them Somebody Else’s Problem.

]]>https://www.hpcwire.com/2009/11/02/cloud_computing_opportunities_in_hpc/feed/05629Grid Computing Done Righthttps://www.hpcwire.com/2009/11/02/grid_computing_done_right/?utm_source=rss&utm_medium=rss&utm_campaign=grid_computing_done_right
https://www.hpcwire.com/2009/11/02/grid_computing_done_right/#respondMon, 02 Nov 2009 08:00:00 +0000http://www.hpcwire.com/?p=5635Writing and implementing high performance computing applications is all about efficiency, parallelism, scalability, cache optimizations and making best use of whatever resources are available -- be they multicore processors or application accelerators, such as FPGAs or GPUs. HPC applications have been developed for, and successfully run on, grids for many years now.

]]>Writing and implementing high performance computing applications is all about efficiency, parallelism, scalability, cache optimizations and making best use of whatever resources are available — be they multicore processors or application accelerators, such as FPGAs or GPUs. HPC applications have been developed for, and successfully run on, grids for many years now.

HPC on Grid

A good example of a number of different components of HPC applications can be seen in the processing of data from CERN’s Large Hadron Collider (LHC). The LHC is a gigantic scientific instrument (with a circumference of over 26 kilometres), buried underground near Geneva, where beams of subatomic particles — called hadrons, either protons or lead ions — are accelerated in opposite directions and smashed into each other at 0.999997828 times the speed of light. Its goal is to develop an understanding of what happened in the first 10^-12 of a second of the universe after the Big Bang, which will in turn help confirm the existence of the Higgs boson and help to explain dark matter, dark energy, anti-matter, and perhaps the fundamental nature of matter itself.

Data is collected by a number of “experiments,” each of which is a large and very delicate collection of sensors able to capture the side effects caused by exotic, short-lived particles that result from the particle collisions. When accelerated to full speed, the bunches of particles pass each other 40 million times a second, each bunch containing 10^11 particles, resulting in one billion collision events being detected every second. This data is first filtered by a system built from custom ASIC and FPGA devices. It is then processed by a 1,000-processor compute farm, and the filtering is completed by a 3,400-processor farm. After the data has been reduced by a factor of 180,000, it still generates 3,200 terabytes of data a year. And the HPC processing undertaken to reduce the data volume has hardly scratched the surface of what happens next.

Ten major compute sites around the world comprising many tens of thousands of processors (and many smaller facilities) are then put to work to interpret what happened during each “event.” The processing is handled, and the data distribution managed, by the LHC Grid, which is based on grid middleware called gLite that was developed by the major European project, Enabling Grids for E-sciencE (EGEE). High performance is achieved at every stage because the programs have been developed with a detailed knowledge and understanding of the grid, cluster or FPGA that they target.

From Grid to Cloud

Grid computing isn’t dead, but long live cloud computing. As far as early-adopter end users in our 451 ICE program are concerned, cloud computing is now seen very much as the logical endpoint for combined grid, utility, virtualization and automation strategies. Indeed, enterprise grid users see grid, utility and cloud computing as a continuum: cloud computing is grid computing done right; clouds are a flexible resource pool, whereas grids are a fixed one; clouds provision services, whereas grids provision servers; clouds are business, and grids are science. And so the comparisons go on, but through cloud computing, grids now appear to be at the point of meeting some of their promise.

One obvious way to regard cloud computing is as the new marketing-friendly name for utility computing, sprinkled with a little Internet pixie dust. In many respects, its aspirations match the original aspirations of utility computing — the ability to turn on computing power like a tap and pay on a per-drink basis. “Utility” is a useful metaphor, but it’s ambiguous because IT is simply not as fungible as electrical power, for example. The term never really took off. Grid computing, in the meantime, has been hung up on the pursuit of interoperability and the complexity of standardization. Taking the science out of grids has proved to be fairly intractable for all but high performance computing and specialist application tasks.

Clouds usefully abstract away the complexity of grids and the ambiguity of utility computing, and they have been adopted rapidly and widely. Ever since, everyone has been desperately trying to work out what cloud computing means and how it differs from utility computing. It doesn’t, really. Cloud computing is utility computing 2.0 with some refinements, principally that it is delivered in ways we think are very likely to catch on.

But as cloud abstracts away the complexity, it also abstracts away visibility of the details of the underlying execution platform. And without a deep understanding of how to optimize for the target platform, high performance computing becomes, well, just computing.

Building Applications

Human-readable programs are translated into ones that can be executed on a computer by a program called a compiler. A compiler’s first step is lexical analysis, which converts a program into its logical components (i.e., language keywords, operators, numbers and variables). Next, the syntax analysis phase checks that the program complies with the grammar rules of the language. The final two phases of optimization and code generation are often so tightly linked as to be one and the same thing (although some generic optimizations, such as common sub-expression elimination, are independent of code generation). The more the compiler knows about the target system, the more sophisticated the optimizations it can perform, and the higher the performance of the resulting program.
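
As a rough illustration of these phases, the short Python sketch below runs a one-line program through lexical analysis, syntax analysis, a toy constant-folding optimization, and code generation, using Python's own standard-library tooling (tokenize, ast, compile). It is purely illustrative and is not a model of any particular HPC compiler.

    # Minimal sketch of the classic compiler phases, using Python's own
    # tooling purely for illustration (requires Python 3.9+ for ast.unparse).
    import ast
    import io
    import tokenize

    source = "y = 2 * 3 + x * x"

    # Phase 1: lexical analysis -- split the text into tokens
    # (keywords, operators, numbers, variables).
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    print([t.string for t in tokens if t.string.strip()])

    # Phase 2: syntax analysis -- check the tokens against the grammar
    # and build an abstract syntax tree.
    tree = ast.parse(source)

    # Phase 3: a toy optimization pass -- fold constant sub-expressions
    # (2 * 3 -> 6), standing in for the generic optimizations mentioned above.
    class ConstantFolder(ast.NodeTransformer):
        def visit_BinOp(self, node):
            self.generic_visit(node)
            if isinstance(node.left, ast.Constant) and isinstance(node.right, ast.Constant):
                if isinstance(node.op, ast.Mult):
                    return ast.copy_location(ast.Constant(node.left.value * node.right.value), node)
                if isinstance(node.op, ast.Add):
                    return ast.copy_location(ast.Constant(node.left.value + node.right.value), node)
            return node

    optimized = ast.fix_missing_locations(ConstantFolder().visit(tree))

    # Phase 4: code generation -- here, bytecode for the host interpreter;
    # a real HPC compiler would emit machine code tuned to the target system.
    code_object = compile(optimized, "<example>", "exec")
    print(ast.unparse(optimized))  # y = 6 + x * x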

But if a program is running in the cloud, the compiler doesn’t know the details of the target architecture, and so must make lowest-common-denominator assumptions, such as an x86 system with up to 8 cores. Yet much higher performance may be achieved by compiling for many more cores, or for an MPI-based cluster, a GPU or an FPGA.
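
In practice, code that cannot assume anything about its host often falls back to detecting the target at run time. The sketch below is a minimal illustration of that idea; the backend names and dispatch rules are assumptions made for the example, not part of any real HPC toolkit.

    # Minimal sketch of run-time target detection, standing in for the
    # "lowest common denominator" problem described above. Illustrative only.
    import multiprocessing as mp
    import shutil

    def choose_backend():
        """Pick the least-bad execution strategy for whatever node the cloud provides."""
        if shutil.which("nvidia-smi"):
            return "gpu"                     # a GPU is visible; a GPU-enabled build could be used
        cores = mp.cpu_count()
        if cores > 8:
            return f"multicore-{cores}"      # more parallelism than the conservative 8-core assumption
        return "multicore-8"                 # the safe, lowest-common-denominator default

    print("Selected backend:", choose_backend())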

Such technology has become a hot commodity. Google bought PeakStream, Microsoft bought the assets of Interactive Supercomputing and Intel bought RapidMind and Cilk Arts. So the major IT companies are buying up this parallel processing expertise.

Conclusion

Multicore causes mainstream IT a problem in that most applications will struggle to scale as fast as new multicore systems do, and most programmers are not parallel processing specialists. And this problem is magnified many times over when running HPC applications in the cloud, since even if the programmer and the compilers being used could do a perfect job of optimizing and parallelizing an application, the details of the target architecture are unknown.

Is there a solution? In the long term, new programming paradigms or languages are required, perhaps with a two-stage compilation process that compiles to an intermediate language but postpones the final optimization and code generation until the target system is known. And no, I don’t think Java is the answer.
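
A minimal sketch of that two-stage idea follows: a kernel is shipped in a portable intermediate form, and the final "code generation" and choice of parallelism are deferred until the target node is known. Here the intermediate form is simply Python source and the late stage is compile() plus a worker-count decision; this is an assumption-laden toy, not a proposal for a real intermediate language.

    # Stage 1 (target-independent): the kernel in a portable intermediate form.
    # Stage 2 (on the target): generate executable code and pick the parallelism.
    import multiprocessing as mp
    from concurrent.futures import ThreadPoolExecutor

    KERNEL_IR = "def kernel(x):\n    return x * x + 1\n"

    def finalize_for_target(ir_source):
        """Late-stage 'code generation' once the target node is known."""
        namespace = {}
        exec(compile(ir_source, "<ir>", "exec"), namespace)   # late compilation of the portable form
        workers = mp.cpu_count()                              # only now is the target known
        return namespace["kernel"], workers

    if __name__ == "__main__":
        kernel, workers = finalize_for_target(KERNEL_IR)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            print(list(pool.map(kernel, range(16))))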

]]>https://www.hpcwire.com/2009/11/02/grid_computing_done_right/feed/05635Grids or Clouds for HPC?https://www.hpcwire.com/2009/11/02/grids_or_clouds_for_hpc/?utm_source=rss&utm_medium=rss&utm_campaign=grids_or_clouds_for_hpc
https://www.hpcwire.com/2009/11/02/grids_or_clouds_for_hpc/#respondMon, 02 Nov 2009 08:00:00 +0000http://www.hpcwire.com/?p=5650Time and again, people ask questions like "Will HPC move to the cloud?" or "Now that cloud computing is accepted, are grids dead?" or even "Should I now build my grid in the cloud?" Despite all the promising developments in the grid and cloud computing space, and the avalanche of publications and talks on this subject, many people still seem to be confused and hesitant to take the next step.

]]>Time and again, people ask questions like “Will HPC move to the cloud?” or “Now that cloud computing is accepted, are grids dead?” or even “Should I now build my grid in the cloud?” Despite all the promising developments in the grid and cloud computing space, and the avalanche of publications and talks on this subject, many people still seem to be confused and hesitant to take the next step. I think there are a number of issues driving this uncertainty.

Grids didn’t keep all their promises

Grids did not evolve (as some of us originally thought) into the next fundamental IT infrastructure for everything and everybody. Because of the diversity of computing environments, we had to develop different middleware stacks (department, enterprise, global, compute, data, sensors, instruments, etc.), and had to face different usage models with different benefits. Enterprise grids were (and are) providing better resource utilization and business flexibility, while global grids are best suited for complex R&D application collaboration with resource sharing. For enterprise use, setting up and operating grids was often complicated. For researchers, this was seen as a necessary evil: implementing complex applications on HPC systems has never been easy. So what.

Grid: the way station to the cloud

After 40 years of dealing with HPC, grid computing was indeed the next big thing for the grand-challenge, big-science researcher, while for the enterprise CIO, the grid was a way station on the way to the cloud model. For the enterprise today, clouds are providing all the missing pieces: ease of use, economies of scale, business elasticity up and down, and pay-as-you-go (thus getting rid of some CapEx). And in cases where security matters, there is always the private cloud. In more complex enterprise environments, with applications running under different policies, private clouds can easily connect to public clouds — and vice versa — into a hybrid cloud infrastructure, to balance security with efficiency.

Different policies, what does that mean?

No two application jobs are alike. Jobs differ by priority, strategic importance, deadline, budget, IP and licenses. In addition, the nature of the code often necessitates a specific computer architecture, operating system, memory, and other resources. These important differentiating factors strongly influence where and when a job runs. For any new type of job, its specific requirements determine the specific policies that have to be defined and programmed, so that each such job runs exactly according to those policies. Ideally, this is guaranteed by a dynamic resource broker that controls submission to grid or cloud resources, be they local or global, private or public.
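
To make the idea concrete, here is a minimal sketch of such a policy-driven broker. The job attributes and routing rules are illustrative assumptions, not the interface of any real broker product.

    # Minimal sketch of a policy-driven resource broker. Illustrative only.
    from dataclasses import dataclass

    @dataclass
    class Job:
        name: str
        tightly_coupled: bool       # needs low latency / high bandwidth between processes
        sensitive_ip: bool          # data or code that must stay behind the firewall
        needs_gpu: bool = False
        deadline_hours: float = 24.0

    def route(job: Job) -> str:
        """Decide where a job should run, based on its policies."""
        if job.sensitive_ip:
            return "private cloud / in-house cluster"
        if job.tightly_coupled or job.needs_gpu:
            return "HPC grid (capability system)"
        if job.deadline_hours < 4:
            return "public cloud (burst capacity)"
        return "public cloud (cheapest available)"

    for job in [
        Job("crash-test parameter sweep", tightly_coupled=False, sensitive_ip=False),
        Job("proprietary CFD run", tightly_coupled=True, sensitive_ip=True),
        Job("drug-candidate screening", tightly_coupled=False, sensitive_ip=False, deadline_hours=2),
    ]:
        print(f"{job.name:32s} -> {route(job)}")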

Grids or clouds?

One important question is still open: how do I find out, and then tell the resource broker, whether my application should run on the grid or in the cloud? The answer depends, among other things, on the algorithmic structure of the compute-intensive part of the program, which might be intolerant of high latency and low bandwidth. This has been observed in benchmark results. The performance limitations show up mainly in parallel applications with tightly coupled, data-intensive inter-process communication, running on hundreds or even thousands of processors or cores.

The good news is, however, that many HPC applications do not require high bandwidth and low latency. Examples are parameter studies often seen in science and engineering, with one and the same application executed for many parameters, resulting in many independent jobs, such as analyzing the data from a particle physics collider, identifying the solution parameter in optimization, ensemble runs to quantify climate model uncertainties, identifying potential drug targets via screening a database of ligand structures, studying economic model sensitivity to parameters, and analyzing different materials and their resistance in crash tests, to name just a few.
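
The sketch below illustrates the pattern: one and the same application function evaluated for many parameter values as fully independent tasks, the loosely coupled workload that tolerates cloud latencies well. The simulate function is a hypothetical stand-in for a real solver, and in a real deployment each evaluation would be an independent cloud job rather than a local process.

    # Minimal sketch of a parameter study: independent evaluations of one
    # application over a parameter grid. The "solver" is a placeholder model.
    from concurrent.futures import ProcessPoolExecutor

    def simulate(parameter: float) -> float:
        """Stand-in for one independent run of the application."""
        return parameter ** 2 - 3.0 * parameter + 2.0   # placeholder objective

    if __name__ == "__main__":
        parameters = [i * 0.5 for i in range(20)]        # the study's parameter grid
        with ProcessPoolExecutor() as pool:              # in practice: one cloud job per parameter
            results = list(pool.map(simulate, parameters))
        best = min(zip(results, parameters))
        print("best objective %.3f at parameter %.2f" % best)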

A Grid in the cloud

One great example of a project that has built a grid in the cloud is Gaia, a European Space Agency funded mission that aims to monitor one billion stars. Amazon Machine Images (AMIs) were configured for the Oracle database grid and the processing software (AGIS). The result is an Oracle grid running inside the Amazon Elastic Compute Cloud (EC2). To process five years of data for 2 million stars, 24 iterations of 100 minutes each translate into 40 hours on 20 EC2 CPU instances. Benefits include reduced costs (50 percent compared to the in-house solution) and massive scalability on demand, without having to invest in new infrastructure or train new personnel. And only a single line of source code had to be changed to get it to run in the cloud.
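
As a quick back-of-envelope check of those figures (the inputs are the numbers quoted above; the instance-hour total is simple arithmetic, not a figure from the project):

    # Back-of-envelope check of the Gaia-on-EC2 figures quoted above.
    iterations = 24
    minutes_per_iteration = 100
    instances = 20

    wall_clock_hours = iterations * minutes_per_iteration / 60   # 24 * 100 min = 40 h
    instance_hours = wall_clock_hours * instances                 # 40 h * 20 instances = 800

    print(f"wall-clock time: {wall_clock_hours:.0f} hours")
    print(f"EC2 instance-hours consumed: {instance_hours:.0f}")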

HPC needs grids and clouds

According to the DEISA Extreme Computing Initiative (DECI), there are still plenty of grand-challenge science and engineering applications that can only run effectively on the largest and most expensive supercomputers. DEISA, a European HPC grid also called the HPC Ecosystem, is made up of 11-teraflops nodes.

Today, nobody would build an HPC cloud for these particular applications. It simply wouldn’t be a profitable business; the “market” (i.e., the HPC users) is far too small and thus lacks economies of scale. In some specific science application scenarios, with complex workflows of different tasks (nodes), a hybrid infrastructure might make sense: cloud capacity resources and HPC capability nodes, providing the best of both worlds.

About the Author

Wolfgang Gentzsch is Dissemination Advisor for DEISA, the Distributed European Initiative for Supercomputing Applications. He is an adjunct professor of computer science at Duke University in Durham and a visiting scientist at RENCI, the Renaissance Computing Institute at UNC Chapel Hill, both in North Carolina. From 2005 to 2007, he was the Chairman of the German D-Grid Initiative. Recently, he was Vice Chair of the e-Infrastructure Reflection Group (e-IRG) and Area Director of Major Grid Projects of the Open Grid Forum (OGF) Steering Group, and he is a member of the US President’s Council of Advisors for Science and Technology (PCAST-NIT).