March 08, 2011

Productivity vs. Control tradeoffs in PaaS

Gartner's paper does a pretty good job of covering the evolution taking place in the PaaS market toward a more open platform, and compares the two main categories: aPaaS (essentially a PaaS running as a service) and CEAP (Cloud Enabled Application Platform), which is the *P* in PaaS that gives you the platform to build your own PaaS in a private or public cloud.

According to Gartner, the main split between the two categories is Productivity vs. Control:

The cloud application platform markets are splitting to support two different constituencies: mainstream application developments that are focused on fast time to deployment, and advanced projects requiring the full control of the underlying cloud application platform attributes.

The paper also provides useful takeaways that follow that line of thought:

The aPaaS market is now dominated by high-productivity offerings targeting mainstream, often opportunistically oriented, rapid application deployments, at times implemented by "citizen developers" without formal IT department approval.

High-control offerings are emerging to support the most-advanced, systematically oriented requirements. These projects often look primarily for a CEAP, rather than an aPaaS, to enable maximum control over the technology environment.

My take:

While I was reading through the paper I felt that something continued to bother me about this definition, even though I tend to agree with the overall observation. If I follow the logic of this paper, then I have to give up productivity to gain control. Hmm… that’s a hard choice.

The issue seems to be with the way we define productivity. Let me explain.

Productivity – redefined

The term "productivity" is pretty broad. Even after some brief research on the subject, it became clear to me that even today we're lacking an agreed-upon model for measuring programming productivity, as noted in the Wikipedia definition:

A generally accepted working definition of programmer productivity needs to be established and agreed upon. Appropriate metrics need to be established. Productivity needs to be viewed over the lifetime of code. Example: Programmer A writes code in a shorter interval than programmer B but programmer A's code is of lower quality and months later requires additional effort to match the quality of programmer B's code; in such a case, it is fair to claim that programmer B was actually more productive.

During my short research I came across an interesting study on the subject by Google. I thought that the study went a long way toward defining productivity. It does so by taking a specific challenge (HPC in this case) and measuring the actual progress students made in developing a certain task at different stages, using various technologies. For example, it measured how much time was spent developing a certain algorithm, and the time it took to debug it, optimize it, and get it working. One of the interesting outcomes of this study, which seems to be generally applicable to many other technologies, is that we often spend more time debugging and optimizing code than on the actual development. It was also found that the biggest bang for the buck comes from shortening the transitions between the various development cycles, and less from reducing the time of each individual cycle. (I’ll get back to that later on in this post.)

Which platform is more productive?

On a Stack Overflow question comparing the RoR and Grails platforms (Is developer productivity higher on Ruby on Rails or Grails?) it became clear that the answer is always subjective, as it depends on your existing skillset, legacy, framework, etc. In other words, there is no clear winner and the answer can differ on a case-by-case basis. This brings me to the point that it's going to be hard to claim that one platform is more productive than another without taking the specific context into account.

Also see Joseph Ottinger's post, "It's not about C++ and Java performance," which addresses a series of benchmarks people were trying to use to assert that Java was better than C++ and vice versa.

Defining Platform-Productivity

The definition that seems to resonate best with me was provided by Can Wolf in one of his comments on the Stack Overflow thread above:

Productivity is measured by units of features being delivered (not lines of code).

Can Wolf's comment in brackets is even more interesting. In many of the recent language debates centered around productivity, we seem to be zeroing in on the number of lines of code required to deliver a given functionality as the main measure of productivity. As I noted in the opening of this section, reducing the number of lines of code covers only a narrow aspect of productivity, and not necessarily the best one.

Measuring the time it takes to bring a new feature from development into production is where I believe we should put the focus when we measure productivity, as this is what really matters. By looking at that measure we may find that language choice actually makes only a marginal contribution to our overall productivity, compared with how well we can speed up the time it takes to go through the QA and production cycles. We may also find that the right choice would actually be to combine different languages for different purposes.

In the case of PaaS, that means that a platform that makes it extremely easy to write your own hello-world application wouldn’t be considered productive if it fails to take you effectively through the entire debugging and optimization cycles into production.
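To make the difference between the two measures concrete, here is a minimal Python sketch. The `Feature` class, its field names, and all the dates and line counts are made up for illustration; they don't come from any real project or from the study mentioned earlier.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Feature:
    """One delivered feature, with hypothetical stage dates."""
    name: str
    lines_of_code: int
    dev_started: date
    dev_finished: date
    in_production: date

    def coding_days(self) -> int:
        # Time spent on the initial development only.
        return (self.dev_finished - self.dev_started).days

    def lead_time_days(self) -> int:
        # Time from the start of development until the feature is live,
        # including debugging, optimization, QA and deployment.
        return (self.in_production - self.dev_started).days


# Illustrative numbers only: the same feature built on two imaginary stacks.
terse = Feature("search", 120, date(2011, 1, 3), date(2011, 1, 7), date(2011, 2, 4))
verbose = Feature("search", 400, date(2011, 1, 3), date(2011, 1, 12), date(2011, 1, 19))

for f in (terse, verbose):
    print(f.lines_of_code, f.coding_days(), f.lead_time_days())
```

In this invented example the first stack wins on lines of code and on coding time, yet the second one gets the feature into production in half the time, which is the measure I argue for above.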

Productivity vs. Control tradeoffs in PaaS

I started this post by quoting Gartner on the choice between Productivity and Control when looking into the two main categories in the PaaS market.

I thought that Carlos Ble's post Goodbye Google App Engine (GAE) is a good example that illustrates why the initial perception of GAE as a simple platform that provides extreme productivity can be completely wrong.

..developing on GAE introduced such a design complexity that working around it pushes us 5 months behind schedule.

Part of the reason that brought Carlos through that experience, IMO, is that in the course of trying to make GAE extremely productive, the owner made the platform too opinionated, to the point where you lose all the potential productivity gains by trying to adopt their model. In addition to that, with a platform like GAE you have very little freedom to leverage existing frameworks such as your own database, messaging system, or any other third-party service that can by itself be a huge contributor to productivity.

Instead, you're completely dependent on the platform provider's stack and pace of development, and that in itself can work against agility and productivity in yet another dimension. In this specific example Carlos couldn’t use a certain version of a Python library that would have made him more productive, and had to work around issues that were already solved elsewhere. This is a good example of how the lack of flexibility leads to poorer productivity even in the case of simple applications.

Amazon PaaS (BeanStalk) vs Google App Engine – Productivity + Control

It's interesting to compare Google App Engine with Amazon Beanstalk. Amazon provides full control over almost every piece of their platform. You can choose your own operating system and integrate with an external database and other services of your choice, etc. At the same time, Amazon provides layers of abstraction on top of their infrastructure through services like RDS, SimpleDB, SQS, MapReduce, and Beanstalk. In addition to that, they have an ecosystem of PaaS providers such as Heroku, GigaSpaces and others that run on top of their IaaS services and complement their offering.

Open productivity

Amazon's approach is to provide what I would call Open Productivity. It enables you to start with extreme productivity, where you give up some degree of control at the earlier stages. Unlike GAE, you're not locked into any of those layers of abstraction, and you always have the choice to go another level down and pick your own database, OS, framework, etc.

Amazon's approach is pretty much aligned with the approach that we took at GigaSpaces in 2009 when we developed our first generation PaaS, as I noted in this post: "Google App Engine plus Amazon AWS: Best of both worlds". In other words, you don’t always have to trade extreme productivity for control. If the platform architecture is layered correctly, it is possible to get a good degree of both productivity and control, as in the case of Amazon and GigaSpaces. Choosing between the degrees of productivity and control becomes a decision about which layer of abstraction to start with, and not necessarily which platform to choose.

Doing it the other way around, i.e. providing closed productivity first without a good degree of control, could lead to high productivity at the beginning but end up with low productivity as the project evolves, as noted in Carlos' experience. In addition to that, it is often much harder to take a closed productivity platform and open its control levels at a later stage if the platform was not designed for it from the start.

The caveat, though, is that it often takes more time to start with an open platform and build the layer of abstraction on top. In my view, this still leads to a better evolution, as you grow with your users and follow their demand rather than the other way around.

Bottom line – It’s not about productivity

After working on this short analysis, I came to the conclusion that since productivity is a broad term, it would be wrong to think that our choice is to trade productivity for control when we select aPaaS or CEAP platforms.

To put it simply, the main difference between the current class of aPaaS and CEAP is that most aPaaS are targeted toward the long tail of web applications, whereas CEAP is targeted at the higher end of the spectrum.

Every platform tries to provide extreme productivity for the type of applications that it is targeting.

A better way to make a platform choice would be based on the target applications that each platform is aimed at. For example, I would look at GAE for extremely small Java or Python applications (widgets) or prototypes, Heroku for a small Ruby-based app, and Force for an application that needs to integrate more tightly with the Salesforce ecosystem.

I would look at Beanstalk if I’m already running a simple Java application and don’t plan to switch off of Amazon anytime soon. I would choose a CEAP such as GigaSpaces in cases where I don’t want to be bound to a particular cloud provider and want the flexibility to run the same application on a variety of environments, private or public, with or even without virtualization, in Java or .NET. The latter would fit most enterprises and the higher-end SaaS providers, whereas the former would probably fit end-user applications better.

Extensibility & Flexibility

Extensibility and flexibility are attributes of control, and another factor worth considering when selecting a platform. As it stands today, the various platforms tend to be quite different in the level of choice that they offer in languages and frameworks, and in the ability to plug in your service of choice or use only parts of the platform (database, messaging, etc.). As noted earlier in the GAE example, flexibility can actually lead to greater productivity.

What’s next: toward a 2nd generation PaaS

If we follow the evolution of aPaaS and CEAP solutions, we can see a consistent shift toward more open platforms than those that were available in the first generation of PaaS. With the emergence of more open infrastructures in the form of OpenStack, as well as the evolution of enterprise data centers toward hybrid clouds, we can expect an even higher degree of flexibility, not just on Amazon but on any cloud (private or public). The other thing that comes up often in many of the recent DevOps discussions is the demand to extend the level of productivity and automation that Amazon (or GigaSpaces, for that matter) provides for its own built-in services to external services of my choice.

Gartner's paper does a pretty good job covering that shift in the PaaS market, but the use of productivity as the differentiating factor can be misleading, as I outlined in this post.

The good news is that we're expected to get a much better selection of choices among the various tradeoffs around productivity and control. I will dive into the details of that in a follow-up post.

PaaS trend discussion @ Cloud Connect

In this post I tried to lay out my own analysis on this matter. I’m very interested in other people's experience on this subject, so if this topic is of interest to you and you happen to be at CloudConnect next week, I’d be happy if you would drop me an email (natis at gigaspaces dot com) or simply meet me at our CloudConnect booth (106).
