March 10, 2009

Cloud Pricing and Application Architecture

I have been giving a lot of thought lately to cloud pricing. As an adviser to companies from both sides of the issue -- cloud (IaaS and PaaS) providers and cloud users (and potential users) -- I've had an interesting perspective on the issue, which I will discuss in this and several future posts.

Even in traditional data centers and hosting services, software architects and developers give some consideration to the cost of the required hardware and software licenses to run their application. But more often than not, this is an afterthought.

Last May, Michael Janke published a post on his blog which tells the story of how he calculated that a certain query for a widget installed in a web app -- an extremely popular widget -- cost the company $250,000, mainly in servers for the database and RDBMS licenses.

From my experience, companies rarely get down to the level of calculating the costs of specific features, such as a particular query.

So while this kind of metric-based and rational approach is always advisable, things get even more interesting in the cloud.

It's a pay-per-use model. Any optimization of resource utilization, big or small, will yield cost-savings proportional to the resource savings.

It is typically component-based pricing. In AWS, for example, there are separate charges for CPU-hours, for data transferred into and out of the servers, for use of the disk volumes attached to the server (EBS), for storage (both volume and bandwidth), for the message queue service, etc.
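To make the component model concrete, here is a small sketch of how such a bill adds up and how it points at the most expensive component. The rates are hypothetical round numbers in the spirit of 2009-era AWS pricing, not actual published prices:

```python
# Illustrative sketch of component-based cloud billing.
# All rates below are invented for illustration, not real AWS prices.

RATES = {
    "cpu_hours":      0.10,   # $ per instance-hour
    "data_in_gb":     0.10,   # $ per GB transferred in
    "data_out_gb":    0.17,   # $ per GB transferred out
    "ebs_io_million": 0.10,   # $ per million I/O requests
    "storage_gb":     0.15,   # $ per GB-month stored
}

def monthly_bill(usage):
    """Return (total, per-component breakdown) for a usage dict."""
    breakdown = {k: usage.get(k, 0) * rate for k, rate in RATES.items()}
    return sum(breakdown.values()), breakdown

usage = {
    "cpu_hours": 720,        # one instance running all month
    "data_in_gb": 50,
    "data_out_gb": 200,
    "ebs_io_million": 500,   # 500 million I/O requests
    "storage_gb": 100,
}

total, breakdown = monthly_bill(usage)
# The bill itself tells you where the money goes:
worst = max(breakdown, key=breakdown.get)
```

Because each component is metered separately, the breakdown immediately shows which part of the application is driving the cost.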

You get a real-time bill broken down by exact usage of components (at least on AWS).

In other words, whether you were planning on it or not, your real-time bill from your cloud provider will scream at you if a certain aspect of your application is particularly expensive, and it will do so at a very granular level, such as database reads and writes. And any improvement you make will have a direct effect: if you reduce database reads and writes by 10%, those charges will go down by 10%.

This is, of course, very different from the current prevailing model of pricing by discrete "server units" in a traditional data center or a dedicated hosting environment. There, if you reduce database operations by 10%, you still own or rent that server, and the change will have no effect on cost. Sure, if you have a very large application that runs on 100 or 1,000 servers, then such an optimization can yield some savings, and very large-scale apps generally do receive a much more thorough cost analysis -- but again, typically as an afterthought and not at such a granular level.

Another interesting aspect is that cloud providers may offer a different cost structure than that of simply buying traditional servers. For example, they may charge a proportionally higher rate for local disk I/O operations (Amazon charges $0.10 per million I/O requests to EBS) -- something that would barely enter into consideration when buying or renting discrete servers, whether physical or virtual.
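A quick back-of-the-envelope sketch shows how per-request I/O pricing makes a single hot code path visible on the bill. The request volumes below are hypothetical, not taken from any real application:

```python
# Sketch: per-I/O pricing turns a hot code path into a visible line item.
# Request volumes are invented for illustration.

EBS_IO_RATE = 0.10 / 1_000_000   # $0.10 per million I/O requests

def monthly_io_cost(requests_per_second, io_per_request):
    seconds_per_month = 30 * 24 * 3600
    io_ops = requests_per_second * io_per_request * seconds_per_month
    return io_ops * EBS_IO_RATE

# A popular widget doing 20 database I/Os per hit, at 100 hits/sec:
before = monthly_io_cost(100, 20)

# Cut reads/writes 10-to-1, and the charge drops by exactly that ratio:
after = monthly_io_cost(100, 2)
```

With discrete servers this arithmetic would be invisible; here it maps directly onto a line in the bill.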

Now software design decisions are part of the operations budget. Every algorithm decision you make will have a dollar cost associated with it, and it may become more important to craft algorithms that minimize operations cost across a large number of resources (CPU, disk, bandwidth, etc.) than it is to trade off our old friends space and time.

Different cloud architectures will force very different design decisions. Under Amazon, CPU is cheap, whereas under Google App Engine, CPU is a scarce commodity. Applications will not be easily ported between the two niches.

Todd recently updated this post with an example from a Google App Engine application in which:

With one paging algorithm and one use of AJAX the yearly cost of the site was $1000. By changing those algorithms the site went under quota and became free again. This will make life a lot more interesting for developers.
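The trade-off Todd describes can be made explicit: instead of scoring an algorithm only by time and space, score it by the dollar cost of every resource it touches. A minimal sketch, with made-up rates and resource profiles:

```python
# Score algorithm variants by dollar cost rather than big-O alone.
# Rates and resource profiles are invented for illustration.

RATES = {"cpu_hours": 0.10, "io_million": 0.10, "gb_out": 0.17}

def dollar_cost(profile):
    """Total monthly cost of an algorithm's resource profile."""
    return sum(profile[r] * RATES[r] for r in profile)

# Variant A: naive paging -- re-reads the dataset on every page view.
naive = {"cpu_hours": 100, "io_million": 2000, "gb_out": 300}

# Variant B: caches page boundaries -- a bit more CPU, far less I/O.
cached = {"cpu_hours": 120, "io_million": 200, "gb_out": 300}

# A classic space/time analysis might call these roughly equivalent;
# the bill says otherwise.
```

Under component pricing, variant B's extra CPU is trivially cheap compared to the I/O it eliminates -- exactly the kind of decision that used to be invisible on a fixed-server budget.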

So what architectural changes can you make to reduce costs on the cloud? Here's one example:

A while back I wrote a post about GigaSpaces and the Economics of Cloud Computing. GigaSpaces has been -- for those of you new to my blog -- my employer for the past 5 years. I gave five reasons why GigaSpaces will save costs on the cloud, but what I discuss above adds a sixth one. Because GigaSpaces uses an in-memory data grid as the "system-of-record" for your application, it significantly reduces database operations (in some cases a 10-to-1 reduction). On AWS, this could significantly reduce EBS and other charges. It also happens to be good architectural practice. For more on that, see David Van Couvering's Why Not Put the Whole Friggin' Thing in Memory?
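To illustrate the effect, here is a minimal read-through cache in front of a backing store, counting how many reads actually reach the database. This is a generic sketch of the in-memory-data-grid idea, not GigaSpaces' actual API:

```python
# Generic read-through cache sketch: the in-memory layer absorbs
# repeated reads, so only misses reach the (billed) backing store.
# Not the GigaSpaces API -- just an illustration of the pattern.

class BackingStore:
    """Stands in for the database whose reads show up on the bill."""
    def __init__(self, data):
        self.data = data
        self.reads = 0
    def get(self, key):
        self.reads += 1
        return self.data[key]

class ReadThroughCache:
    def __init__(self, store):
        self.store = store
        self.cache = {}
    def get(self, key):
        if key not in self.cache:          # miss: one billed read
            self.cache[key] = self.store.get(key)
        return self.cache[key]             # hit: free, in-memory

store = BackingStore({f"widget:{i}": i for i in range(10)})
cache = ReadThroughCache(store)

# 100 requests over 10 hot keys: only 10 reads ever hit the store.
for i in range(100):
    cache.get(f"widget:{i % 10}")
```

With per-operation pricing, that 10-to-1 reduction in backing-store reads translates directly into a 10-to-1 reduction in the corresponding charge.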

Taking this approach as an example, it could have saved a significant portion of Michael Janke's $250,000 query cost off the cloud, and perhaps an even bigger proportion on the cloud.

If anyone has other ideas on how architectural decisions could affect costs on the cloud, please share them in the comments.
