Understand cloud SLAs to protect critical biz processes

Organizations should understand their service level agreement requirements and sign up only for cloud services that meet them, while strengthening their business recovery mechanisms, industry watchers say.

Businesses need to take a closer look at the service level agreements (SLAs) offered by their cloud providers before inking a contract, industry watchers advise, particularly after the recent outage of Amazon Web Services' (AWS) Elastic Compute Cloud (EC2), which impacted a number of high-profile customers.

Chris Morris, research director for cloud technologies and services at IDC Asia-Pacific, said unscheduled outages are "a fact of life" for both cloud and non-cloud systems. The lesson that needs to be learnt from the AWS outage is that immunity from system downtime cannot be guaranteed by outsourcing that responsibility to a third-party vendor, he added.

"Any IT professional should always have contingency and continuity plans in place to ensure the ongoing operation of critical systems and not assume it is being looked after by [one's service provider]," Morris said in an e-mail interview.

In the EC2 outage, AWS had actually fulfilled its SLA requirements with its customers, he added, noting that the lack of understanding over responsibilities outlined in the SLA was in fact the root cause of disruption suffered by the affected companies.

Lydia Leong, research vice president at Gartner, elaborated on how understanding the SLA could have helped AWS' affected clients better provision for such downtime. She wrote in a blog post that Amazon's SLA for EC2 is 99.95 percent for multi-availability-zone deployments, which means customers should expect about 4.5 hours of total region downtime each year without the cloud provider violating the SLA.

Furthermore, Leong said the SLA defines unavailability as "a lack of external connectivity to EC2 instances, coupled with the inability to provision working instances".

"In this case, EC2 was just fine by that definition. It was [Amazon's Elastic Block Store (EBS) and Relational Database Service] that did not meet the SLA, and neither of these services have SLAs," she noted.
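
The downtime figure Leong cites follows from simple arithmetic on the availability percentage. The short Python sketch below is illustrative only: the function name and the 365-day year are assumptions made for this example, and a provider's contract may define its measurement period and "unavailability" differently. Under these assumptions, 99.95 percent works out to roughly 4.4 hours a year, in line with the "about 4.5 hours" quoted above.

```python
# Illustrative sketch: converts an availability SLA into a yearly downtime budget.
# HOURS_PER_YEAR assumes a 365-day year; a real contract may measure differently.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours


def allowed_downtime_hours(sla_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime a provider may incur without breaching the SLA."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return period_hours * (100 - sla_percent) / 100


if __name__ == "__main__":
    # Compare a few common availability levels, including the 99.95 percent cited.
    for sla in (99.0, 99.9, 99.95, 99.99):
        print(f"{sla}% availability allows {allowed_downtime_hours(sla):.2f} hours/year")
```

A business can compare this budget against the downtime its critical processes can tolerate when weighing whether an offered SLA is acceptable.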

The Apr. 21 service outage triggered a large amount of re-mirroring of EBS volumes in Amazon's northern Virginia site, creating a shortage of capacity in the affected availability zone. This shortage, in turn, impacted new EBS volume creation as well as the pace at which AWS could re-mirror and recover affected EBS volumes, Amazon revealed in an earlier report.

Morris urged companies, particularly small and midsize businesses (SMBs), to understand what is and is not covered by the contract, as part of their pre-contract due diligence process.

"While cloud vendors are upgrading cloud services and associated SLAs, their customers will also have to pay close attention to the contract they have signed," the IDC analyst said. "Unfortunately, many of the early cloud users are in the SMB category with little or no understanding of proper due diligence processes."

Rackspace Hosting's Asia-Pacific managing director, Jim Fagan, concurred. He told ZDNet Asia in an e-mail that every cloud provider offers differing levels of redundancy, reliability and SLAs, and businesses should first conduct their own due diligence to determine if the provider is living up to its promises and offering SLAs to support its guarantees.

Companies should also weigh whether the offered SLAs are acceptable based on their business needs, Fagan added.

With regard to the AWS outage, he noted that while the incident highlighted risks associated with cloud computing, it also showed examples of how cloud systems actually "created a more redundant and stable platform" that is difficult to achieve in a traditional datacenter environment.

"There were many companies that utilized the elasticity and cost advantages of cloud computing to architect a robust disaster recovery plan which allowed them not to be affected by the outage," he said.

Deploying business backups

Meanwhile, Akihiro Okada, president of Fujitsu's cloud business support unit, said infrastructure-as-a-service (IaaS) vendors ought to have the "same level of design, construction and maintenance in highly-reliable cloud platforms that are on par with standard, on-premise data centers".

Additionally, there should be backup systems in centers located in different regions to offer operational continuity, Okada said in an e-mail interview. Itself an IaaS vendor, Fujitsu currently hosts its cloud services via six locations worldwide, he noted.

"Prior to concluding a contract, we make sure that customers understand planned outages, failure reports, support conditions and SLAs, among other issues," he shared. "Fujitsu conveys to its sales teams and customers that contracts are concluded only upon having achieved a clear understanding of conditions."

Apart from operational continuity, multi-location hosting would also help solve the challenge of data sovereignty, said Andrew Milroy, vice president of ICT research in Asia-Pacific, Frost & Sullivan.

He said many cloud providers only have data centers in the United States, and this means their customers' stored data is subject to that country's laws. This, Milroy explained, would be an issue for governments or financial institutions that are required to keep data within their own country and therefore cannot use a cloud provider whose data centers are all located abroad.