
Cloud-TM is a highly innovative data-centric middleware platform aimed at facilitating development and abating operational and administration costs of cloud applications.

Designed from the ground up to meet the scalability and dynamicity requirements of cloud infrastructures, Cloud-TM provides intuitive yet powerful abstractions that mask complexity and allow ordinary programmers to unleash the potential of large-scale cloud platforms.

Further, Cloud-TM integrates pervasive self-tuning schemes, which synergistically combine diverse methodologies such as analytical modelling, simulation and machine learning, to pursue optimal efficiency at any scale and for any workload.

The Challenge

The appearance of the first commercial Cloud Computing platforms has
represented a significant step towards the materialization of the vision
of utility-computing.

However, the promise of infinite scalability that catalyzes much of the recent interest in Cloud Computing is still threatened by one major pitfall: the lack of programming paradigms and abstractions capable of bringing the power of distributed programming into the hands of ordinary programmers, sheltering them from the complexity of developing systems deployed over large-scale, elastic cloud platforms.

A crucial issue that we have tackled in the Cloud-TM project has been developing innovative mechanisms and abstractions aimed at ensuring adequate consistency levels while being:

1. simple and familiar for the programmers

2. highly efficient and scalable

3. fault-tolerant and highly available.

Decades of research and field experience in this area have led to the development of a plethora of different approaches to ensuring state consistency in distributed platforms, and have taught a fundamental, general lesson: no universal, one-size-fits-all solution exists, as the efficiency of individual state management approaches is strongly affected by both:

1. the characteristics of the incoming workload, such as the ratio of read/write operations, as well as the spatial/temporal locality in the data access patterns, and

2. the scale of the system (e.g. low vs high number of nodes, local vs geographical distribution) on which these mechanisms are deployed.

The complexity of this problem is particularly exacerbated in cloud computing platforms by the very feature that is regarded as one of the key advantages of the cloud: its ability to elastically acquire or release resources, varying the scale of the platform in real time to meet the demands of varying workloads.

The Cloud-TM approach

The Cloud-TM project addressed these issues by building a highly innovative data-centric middleware platform. The Cloud-TM platform is designed from the ground up to meet the scalability and dynamicity requirements of cloud infrastructures, while providing intuitive yet powerful abstractions aimed at masking complexity and allowing ordinary programmers to unleash the potential of large-scale Cloud platforms.

Most cloud computing infrastructures embrace weak consistency models that achieve scalability at the cost of increased complexity for the programmers. This leads to a significant growth in software development costs and in time to market, ultimately hindering competitiveness.

Conversely, Cloud-TM adopts an intuitive yet scalable programming paradigm. The Cloud-TM programming paradigm integrates the friendly abstraction of the atomic transaction as a first-class programming construct, sheltering programmers from having to deal with the idiosyncrasies of weak consistency models. Strong consistency and scalability, two properties often seen as antagonists, are reconciled thanks to innovative transactional consistency schemes designed precisely to meet the scalability and elasticity requirements of typical cloud infrastructures.
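To make the idea of the atomic transaction as a programming construct more concrete, here is a minimal, single-process sketch in Python. It is not the Cloud-TM API: the names `TransactionalStore`, `Transaction`, `ConflictError` and `atomically` are hypothetical, and the optimistic validate-at-commit scheme shown is a textbook technique chosen for brevity, not one of the project's consistency protocols.

```python
import threading


class ConflictError(Exception):
    """Raised when a key read by a transaction changed before it committed."""


class TransactionalStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._data = {}                 # key -> (value, version)
        self._lock = threading.Lock()   # guards _data

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self._store = store
        self._reads = {}    # key -> version observed on first read
        self._writes = {}   # key -> value buffered until commit

    def get(self, key, default=None):
        if key in self._writes:         # read your own buffered writes
            return self._writes[key]
        with self._store._lock:
            value, version = self._store._data.get(key, (default, 0))
        self._reads.setdefault(key, version)
        return value

    def put(self, key, value):
        self._writes[key] = value

    def commit(self):
        with self._store._lock:
            # Validate: every key we read must still be at the version we saw.
            for key, seen in self._reads.items():
                _, current = self._store._data.get(key, (None, 0))
                if current != seen:
                    raise ConflictError(key)
            # Apply all buffered writes together, bumping each version.
            for key, value in self._writes.items():
                _, current = self._store._data.get(key, (None, 0))
                self._store._data[key] = (value, current + 1)


def atomically(store, work, retries=10):
    """Run work(tx) in a transaction, retrying if a conflict is detected."""
    for _ in range(retries):
        tx = store.begin()
        try:
            result = work(tx)
            tx.commit()
            return result
        except ConflictError:
            continue
    raise ConflictError("gave up after %d attempts" % retries)


# Usage: the transfer either happens completely or not at all.
store = TransactionalStore()
atomically(store, lambda tx: (tx.put("alice", 100), tx.put("bob", 0)))


def transfer(tx):
    amount = 30
    tx.put("alice", tx.get("alice") - amount)
    tx.put("bob", tx.get("bob") + amount)


atomically(store, transfer)
```

The application code in `transfer` contains no locking or conflict handling of its own; that is the sense in which the transaction abstraction shelters the programmer from consistency concerns.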

Beyond transactional
consistency, the Cloud-TM programming model provides transparent support
for object orientation and queries, concurrency-friendly data
structures and frameworks to control distributed execution of tasks,
hiding issues such as fault-tolerance, load distribution and data placement.

Finally, Cloud-TM pursues the minimization of the other major source of costs for cloud-based applications, namely operational costs, in a twofold way:

Automating the provisioning of
resources from
the cloud based on user specified target criteria in terms of both Quality of Service and budget constraints. This
allows guaranteeing that applications only use the minimum amount of necessary resources to withstand the current load
pressure, minimizing both administration
and operational costs.

Maximizing efficiency (i.e. the costs/benefits ratio
in the Cloud Computing usage-based pricing model) via pervasive self-tuning schemes that
adapt the middleware's internals to ensure optimal performance at any scale, and for any workload. This means making the most effective use of the
currently allocated resources, leading to a reduction of the amount of required
resources, and, consequently, of the operational costs.
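As a rough illustration of the first point, the sketch below picks the smallest number of nodes that satisfies a throughput target without exceeding an hourly budget. Everything in it is an assumption made for the example: the `Targets` type, the `smallest_sufficient_scale` function, the prices, and above all `toy_model`, a made-up scaling curve standing in for the analytical, simulation-based or learned performance models that the platform relies on.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Targets:
    min_throughput: float      # QoS target, transactions per second
    max_hourly_budget: float   # budget constraint, currency units per hour


def smallest_sufficient_scale(
    targets: Targets,
    predicted_throughput: Callable[[int], float],
    hourly_price_per_node: float,
    max_nodes: int = 64,
) -> Optional[int]:
    """Return the fewest nodes meeting the QoS target within budget, else None."""
    for nodes in range(1, max_nodes + 1):
        if nodes * hourly_price_per_node > targets.max_hourly_budget:
            return None  # cost only grows with scale, so no larger size fits
        if predicted_throughput(nodes) >= targets.min_throughput:
            return nodes
    return None


def toy_model(nodes: int) -> float:
    """Made-up sub-linear curve: 1000 tx/s per node, 5% lost per extra node."""
    return 1000.0 * nodes * (0.95 ** (nodes - 1))


# Usage: ask for 3000 tx/s with at most 3.0 per hour, at 0.5 per node-hour.
chosen = smallest_sufficient_scale(Targets(3000.0, 3.0), toy_model, 0.5)
```

Searching upward from one node and stopping at the first size that meets the target is what keeps the allocation at the minimum needed for the current load; a real controller would re-run such a decision as the workload changes.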

Overview of the Cloud-TM Platform

The high-level architecture of the Cloud-TM Platform is depicted in the following figure. It is formed by two main parts: the Data Platform and the Autonomic Manager.

Data Platform. The Data Platform is responsible for storing, retrieving and manipulating data across a dynamic set of distributed nodes, elastically acquired from the underlying IaaS Cloud provider(s).

The Data Platform Programming APIs have been designed to simplify the development of large-scale, data-centric applications deployed on cloud infrastructures. They include the Object Grid Mapper, the Search API and the Distributed Execution Framework.

To this end, the programmatic interfaces offered by the Cloud-TM Data Platform allow developers to:

store/query data into/from the Data Platform using the familiar and convenient abstractions provided by the object-oriented paradigm, such as inheritance, polymorphism and associations;

take full advantage of the processing power of the Cloud-TM Platform via a set of simple abstractions that hide the complexity associated with parallel/distributed programming, such as thread synchronization and scheduling, and fault-tolerance.

Lower in the stack we find a highly scalable, adaptive In-memory Distributed Transactional Key-Value Store/Distributed Transactional Memory (DTM), which represents the backbone of the Cloud-TM Data Platform. In order to maximize the visibility, impact and future exploitation of the results of the Cloud-TM project, the consortium agreed to use Red Hat's Infinispan as the starting point for developing this essential component of the Cloud-TM Platform. Throughout the project, Infinispan has been extended with innovative data management algorithms (in particular concerning data replication and distribution), as well as with real-time self-reconfiguration schemes aimed at guaranteeing optimal performance even in highly dynamic cloud environments.

Autonomic Manager. The Autonomic Manager is the component in charge of the self-tuning of the Data Platform. In the Cloud-TM Platform, self-optimization is a pervasive property that is pursued across multiple layers of the platform.

Specifically, the Cloud-TM Platform leverages a number of complementary self-tuning mechanisms that aim to automatically optimize, on the basis of user-specified Quality of Service (QoS) levels and cost constraints, the following functionalities/parameters:

the scale of the underlying platform, i.e. the number and type of machines over which the Data Platform is deployed;

the data replication degree, i.e. the number of replicas of each datum stored in the platform;

the transactional data consistency protocol;

the data placement
strategies and request distribution policies, with the ultimate goal of
maximizing the data access locality of Cloud-TM applications.
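The last item, data access locality, can be illustrated with a deliberately simple sketch: given a log of which node accessed which key, assign each key to the node that accessed it most often and measure the fraction of accesses that become local. The function names and the log format are hypothetical, and the heuristic ignores load balancing and replication, which an actual placement strategy would have to take into account.

```python
from collections import Counter, defaultdict


def place_by_access_frequency(accesses):
    """Map each key to the node that accessed it most often.

    `accesses` is an iterable of (node, key) pairs. Ties are broken by
    node name so that the result is deterministic.
    """
    per_key = defaultdict(Counter)
    for node, key in accesses:
        per_key[key][node] += 1
    return {
        key: min(counts, key=lambda n: (-counts[n], n))
        for key, counts in per_key.items()
    }


def locality(accesses, placement):
    """Fraction of accesses issued by the node that owns the accessed key."""
    accesses = list(accesses)
    if not accesses:
        return 0.0
    local = sum(1 for node, key in accesses if placement.get(key) == node)
    return local / len(accesses)


# Usage: compare a naive placement with the frequency-based one.
log = [("n1", "a"), ("n1", "a"), ("n2", "a"),
       ("n2", "b"), ("n2", "b"), ("n1", "b")]
naive = {"a": "n1", "b": "n1"}
tuned = place_by_access_frequency(log)
```

In this toy log, moving key `b` to the node that uses it most raises the share of local accesses, which is the effect a locality-aware placement policy is after.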

The following figure illustrates an example scenario highlighting the self-optimizing
capabilities of the Cloud-TM platform. Depending on the current workload
characteristics, Cloud-TM can autonomously acquire or release resources from
the Cloud, and adjust, in a transparent manner, its internal consistency
mechanisms to maximize performance and efficiency.

Videos

The YouTube channel of the project contains several videos demonstrating a number of features of the Cloud-TM platform.

Since the early stages of the project, academic partners have worked in close collaboration with Red Hat, the leading company in the open-source software arena. This has made it possible to integrate a number of innovative solutions into highly visible open source projects, such as Infinispan, JGroups, Hibernate Search and Hibernate OGM.

The choice of embracing open source, and the
integration of the best-of-breed research results in popular Red Hat projects,
have strongly amplified the impact and
visibility of the project's achievements, and paved the way for their
immediate industrial exploitation.

The choice of open source also means that the Cloud-TM platform is freely available to the broad community of SMEs that find in cloud computing a highly attractive model, not only from the economic perspective (thanks to its advantageous pay-only-for-what-you-use billing scheme), but also due to its simplicity and scalability.

This work was generated in the framework of the Specific Targeted Research Project (STReP) Cloud-TM, which is co-financed by the European Commission through the contract no. 257784.

Duration:

From June 2010 to May 2013

Key Goals:

1. Transaction-centric programming paradigm

Defining a friendly programming model for large-scale distributed
applications that integrates the familiar notion of atomic transaction
as a first-class programming language construct. This would spare
programmers from the burden of implementing low-level, error-prone
mechanisms (e.g. distribution, persistence and fault-tolerance),
attaining major reductions in the cost of the development process.

2. Minimizing Costs

Minimizing the monitoring and administration costs by automating the
provisioning of resources from the cloud based on user-specified target
criteria in terms of both Quality of Service and budget.

3. Maximizing Scalability

Maximizing the scalability and efficiency (i.e. the costs/benefits ratio
in the Cloud Computing usage-based pricing model) of the user-level
services by self-tuning the middleware's internal mechanisms to ensure
optimal performance in the face of fluctuations in the number of allocated
resources and in the workload characteristics.