Cloud computing is the most recent instantiation of utility computing under service-oriented architecture (SOA). Cloud computing provides computing resources on demand and enables elastic sharing of computing resources as plug and play services to cloud service consumers, cloud partners, and cloud vendors in the cloud value chain. Users only pay for the volume and time of the resources being used, and this pay-as-you-go model attracts both business and individual users. The resource sharing in cloud computing can occur at various levels of abstraction, resulting in numerous cloud offerings, such as Infrastructure as a Service, Platform as a Service, and Software as a Service. Clouds have been made possible by building on top of services and SOA, hardware and software virtualization, web technology, and standards. Current cloud research is focused on the interactions between these underlying technologies and how to provide a high quality of service (QoS).

This special issue on cloud computing originated from the very successful CLOUD 2010 conference, held in Miami, Florida, in July 2010. The theme of the conference was “Change We Are Leading,” aimed at state-of-the-art technology advances in cloud infrastructure and at active research areas including cloud security, cloud reliability, and cloud service discovery. Submitted papers underwent a thorough selection process in which every submission was reviewed by at least three members of the program committee, and only 20 percent of the papers were selected for presentation and inclusion in the conference proceedings.

After the conference, the authors of the top 30 percent of the conference papers were invited to submit an extended journal version for the IEEE Transactions on Services Computing Special Issue on Cloud Computing. The extended submission was required to have significant expansion and enhancement to its conference paper version in terms of content, scope, and quality. A second peer review process was conducted on every submitted journal paper. We selected eight high quality papers to be included in this special issue. These selected papers highlight the theme of the conference, “Change We Are Leading,” with research breakthroughs that are innovative and significant.

The first paper, entitled “A Trusted Virtual Machine in an Untrusted Management Environment” by Li et al., deals with virtualization and the influence of a virtual machine (VM) on security. On one hand, virtualization can benefit computing systems by improving resource utilization and increasing software portability and reliability. It may even enhance security by providing isolated execution environments for applications that require different levels of security. On the other hand, a hypervisor can be used to mount an attack against the environments it supports. Therefore, for security-critical applications, it is highly desirable to have a small trusted computing base (TCB) to minimize the attack surface. For many applications, it is not acceptable to trust an OS because its attack surface can be huge. The authors propose a secure virtualization architecture that provides a secure run-time environment, network interface, and secondary storage for a guest VM. Their analysis shows that the proposed architecture significantly reduces the TCB of security-critical guest VMs, which in turn improves security in an untrusted management environment. The authors built a prototype on the Xen virtualization system and demonstrate how it can be used to facilitate secure remote computing services with only a slight effect on execution performance.

In the second paper, “VNsnap: Taking Snapshots of Virtual Networked Infrastructures in the Cloud,” Kangarlou et al. address issues in a virtual networked infrastructure (VNI), in which VMs are connected by a virtual network and which thereby realizes the concept of “Infrastructure as a Service” (IaaS). A critical feature of such an infrastructure is a VNI checkpoint that can be used to restore the operation of the entire virtual infrastructure. The authors present VNsnap, a system that takes distributed checkpoints of VNIs. The main advantages of their solution are that VNsnap 1) does not require any modification to the applications, libraries, or (guest) operating systems running in the VMs, and 2) incurs only seconds of overhead. Running VNsnap on top of Xen, the authors demonstrate that it is effective and efficient when executing real-world parallel and distributed applications.

In the third paper, “Resource Provisioning with Budget Constraints for Adaptive Applications in Cloud Environments,” Zhu et al. claim that, while making the vision of utility computing realizable, clouds create new resource provisioning problems of their own. Their substantiation of this claim is that, according to “the pay-as-you-go model, resource provisioning should be performed in a way to keep resource costs to a minimum, while meeting an application's needs.” The authors focus on 1) the use of cloud resources for a class of adaptive applications with specific desired flexibility, and 2) the possibility of meeting a fixed time limit and a resource budget. Within these constraints, the goal is to maximize the QoS of such applications, or more precisely, the value of an application-specific benefit function, by dynamically changing adaptive parameters. The paper contains the design, implementation, and evaluation of a framework that can support such dynamic adaptation for applications in a cloud.

As Yi et al. state in their paper entitled “Monetary Cost-Aware Checkpointing and Migration on Amazon Cloud Spot Instances,” Amazon has published a list of EC2 service provision costs, which greatly benefits ordinary users. Furthermore, it recently introduced spot instances in EC2 that offer low resource costs in exchange for reduced reliability. Because these instances can be revoked due to price and demand fluctuations, mechanisms that manage the trade-off between cost and reliability are of great value for users seeking to lessen their costs while maintaining high reliability. The authors present the results of a study on how checkpointing and migration can be used to minimize the cost and volatility of resource provisioning. They compare several adaptive checkpointing schemes in terms of monetary costs and improvement of job completion times, based on the real price history of EC2 spot instances. Furthermore, the authors evaluate schemes that apply predictive methods for spot prices, and examine how work migration can improve task completion in the midst of failures while maintaining low monetary costs. Their simulation experiments show that the proposed schemes can reduce both the monetary costs and the task completion times of computation on spot instances.
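The cost and completion-time trade-off that checkpointing addresses on revocable instances can be illustrated with a toy simulation. The sketch below is not one of the schemes evaluated by Yi et al.; it uses plain fixed-interval checkpointing and makes several simplifying assumptions: an hourly price trace, a charge equal to the spot price for every hour the instance runs, free and instantaneous checkpoints, and immediate restart once the price drops back to the bid or below. The function name `simulate_spot_job` and its parameters are hypothetical.

```python
from typing import List, Optional, Tuple


def simulate_spot_job(
    prices: List[float],
    bid: float,
    work_hours: int,
    checkpoint_every: Optional[int],
) -> Tuple[Optional[int], float]:
    """Return (completion hour, total cost) for a job on a simulated spot instance.

    The completion hour is None if the price trace ends before the job finishes.
    Assumptions (illustrative only): hourly granularity, zero checkpoint cost.
    """
    saved = 0      # hours of work stored in the last checkpoint
    progress = 0   # hours of work completed, including unsaved work
    cost = 0.0
    for hour, price in enumerate(prices):
        if price > bid:
            # Out-of-bid hour: the instance is revoked and unsaved work is lost.
            progress = saved
            continue
        cost += price
        progress += 1
        if progress >= work_hours:
            return hour + 1, cost
        if checkpoint_every and (progress - saved) >= checkpoint_every:
            saved = progress
    return None, cost


# Hypothetical hourly price trace with one price spike in the fourth hour.
trace = [0.03, 0.03, 0.03, 0.10, 0.03, 0.03, 0.03, 0.03]
no_ckpt = simulate_spot_job(trace, bid=0.05, work_hours=4, checkpoint_every=None)
hourly_ckpt = simulate_spot_job(trace, bid=0.05, work_hours=4, checkpoint_every=1)
```

In this made-up trace, the job without checkpoints loses three hours of work at the spike and must start over, whereas the job that checkpoints hourly resumes from its saved state, finishing earlier and paying for fewer instance hours. A realistic model would also charge for the time and storage that each checkpoint consumes.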

The starting point of the paper entitled “CloudTPS: Scalable Transactions for Web Applications in the Cloud” by Zhou et al. is the observation that NoSQL cloud data stores provide scalability and high availability to web applications at the expense of data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager that guarantees full ACID properties for multi-item transactions issued by web applications, even in the presence of server failures and network partitions. The authors present the implementation of this approach on top of the two main families of scalable data layers, Bigtable and SimpleDB, and evaluate its performance on HBase (an open-source version of Bigtable) in their local cluster and on Amazon SimpleDB. The evaluation shows that the implemented system scales linearly at least up to 40 nodes in the local cluster and 80 nodes in the Amazon cloud.

The next paper, entitled “Component Ranking for Fault-Tolerant Cloud Applications,” is by Zheng et al. The authors state that, on one hand, building highly reliable cloud applications is a challenging and critical research problem; on the other hand, one must consider the costs of such an endeavour. In this paper, the authors address the research aspect and propose a component ranking framework, named FTCloud, for building fault-tolerant cloud applications. The FTCloud framework includes two ranking algorithms: one that employs component invocation structures and invocation frequencies to rank components, and one that systematically fuses the system structure information with the application designers' wisdom to identify the significant components in a cloud application. After the component ranking phase, an algorithm automatically determines an optimal fault tolerance strategy for the significant cloud components. The experimental results show that, by tolerating faults in a small part of the most significant components, the reliability of cloud applications can be greatly improved.
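As an illustration of how invocation structure and invocation frequency can be combined into a single significance score, the sketch below runs a generic PageRank-style random walk over a weighted invocation graph. It is an assumption made for illustration, not the FTCloud algorithm itself; the function name `rank_components`, the input format, and the component names are hypothetical.

```python
from typing import Dict, Tuple


def rank_components(
    calls: Dict[Tuple[str, str], int],
    damping: float = 0.85,
    iterations: int = 100,
) -> Dict[str, float]:
    """Score components with a PageRank-style walk over the invocation graph.

    `calls` maps (caller, callee) to an invocation count. A component scores
    highly when it is invoked frequently by other highly scored components.
    """
    nodes = sorted({name for edge in calls for name in edge})
    n = len(nodes)
    # Total outgoing invocations per caller, used to normalize edge weights.
    out_total = {v: 0 for v in nodes}
    for (caller, _), count in calls.items():
        out_total[caller] += count

    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Score held by components that invoke nothing is spread evenly.
        dangling = sum(score[v] for v in nodes if out_total[v] == 0)
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for (caller, callee), count in calls.items():
            if count > 0:
                nxt[callee] += damping * score[caller] * count / out_total[caller]
        score = nxt
    return score


# Hypothetical three-component application: a front end, an auth service, a database.
calls = {("web", "auth"): 10, ("web", "db"): 5, ("auth", "db"): 10}
scores = rank_components(calls)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the database ranks first because both other components invoke it, which matches the intuition that fault tolerance should be applied first to the components on which most of the application depends.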

Chard et al., in their paper “Social Cloud Computing: A Vision for Socially Motivated Resource Sharing,” address two associated concepts, clouds and trust. The level of trust between users in social networks can be determined by online relationships. However, establishing a level of trust among users who share resources remains a problem. In response, the authors propose leveraging users' relationships to form a dynamic “Social Cloud,” which enables users to share heterogeneous resources within the context of a social network. The authors also propose exploiting inherent socially corrective mechanisms, such as incentives and disincentives, to enable long-term sharing with lower privacy concerns and security overheads than are present in traditional cloud environments. As a means of regulating sharing, they propose a social marketplace. The paper contains a definition of social cloud computing, outlines various aspects of social clouds, and demonstrates the approach using a social storage cloud implementation in Facebook.

According to Sim in his paper “Agent-Based Cloud Computing,” cloud service publication, discovery, negotiation, and composition are among the most critical issues in service provisioning and consumption. In response, the author presents the design and development of software agents for bolstering cloud service discovery, service negotiation, and service composition. The significance of this work lies in the introduction of an agent-based paradigm for constructing software tools and test beds for cloud resource management. The paper presents three novel contributions: 1) Cloudle, an agent-based search engine developed for cloud service discovery, 2) a demonstration that agent-based negotiation mechanisms can be adopted for bolstering cloud service negotiation and cloud commerce, and 3) a demonstration that agent-based cooperative problem-solving techniques can be effectively adopted for automating cloud service composition. The paper describes the architecture and implementation of the proposed solution. Empirical results show that, using the proposed system, agents achieved high utilities and high success rates in negotiating and composing cloud resources.

Cloud computing as a multidisciplinary field can benefit from careful integration and exploitation of advances in computer architecture, operating systems, distributed computing, network computing, service computing and service science, information systems, knowledge discovery, knowledge modeling and engineering, social sciences, and economics. With the fast growth of data-center technologies, system virtualization technologies, and research and development for making software as a service, infrastructure as a service, and platform as a service, we envision that cloud computing will continue to penetrate our business and social life as well as transform the service-oriented computing paradigm to the next level of innovation. We believe that the selected papers in this special issue represent the right step toward the motto of our special issue, “Change We Are Leading,” providing a comprehensive snapshot of the fast moving research in cloud computing that may influence many other ongoing research efforts. We trust that you will enjoy reading it.

This special issue would not have been possible without the help provided by many. We would like to thank all the authors for their contributions, and all reviewers for their effort and dedication to this special issue. Furthermore, we want to thank the editorial board of the IEEE Transactions on Services Computing, the Editor-in-Chief, Liang-Jie Zhang, and the IEEE Computer Society staff, Joyce Arnold and Kimberly Sperka, for their editorial support.


Andrzej Goscinski is a professor in the School of Information Technology at Deakin University. His research interests include cloud computing, service computing, distributed systems, resource management, and making deployment and access of HPC applications on clouds easy.

Wu Chou is vice president, chief IT scientist, and head of the Huawei Shannon (IT) Lab, Huawei Technologies (USA). His research interests include cloud computing, web, Internet, data networking, web services, unified communication and collaboration, smart systems, IT platforms, and information systems. He is a fellow of the IEEE.

Ling Liu is a professor in the College of Computing at the Georgia Institute of Technology. Her research interests include cloud computing, Internet computing, distributed systems, data management, data mining, security, and privacy.