March 10, 2009

I have been giving a lot of thought lately to cloud pricing. As an adviser to companies from both sides of the issue -- cloud (IaaS and PaaS) providers and cloud users (and potential users) -- I've had an interesting perspective on the issue, which I will discuss in this and several future posts.

Even in traditional data centers and hosting services, software architects and developers give some consideration to the cost of the required hardware and software licenses to run their application. But more often than not, this is an afterthought.

Last May, Michael Janke published a post on his blog which tells the story of how he calculated that a certain query for a widget installed in a web app -- an extremely popular widget -- cost the company $250,000, mainly in servers for the database and RDBMS licenses.

From my experience, companies rarely get down to the level of calculating the costs of specific features, such as a particular query.

So while this kind of metric-based and rational approach is always advisable, things get even more interesting in the cloud.

It's a pay-per-use model. Any optimization of resource utilization, big or small, will yield cost savings proportional to the resource savings.

It is typically component-based pricing. In AWS, for example, there are separate charges for CPU-hours, for data transferred in and out of the servers, for the block storage volumes attached to the servers (EBS), for storage (volume and bandwidth), for the message queue service, etc.

You get a real-time bill broken down by exact usage of components (at least on AWS).

In other words, whether you were planning on it or not, your real-time bill from your cloud provider will scream at you if a certain aspect of your application is particularly expensive, and it will do so at a very granular level, such as database reads and writes. And any improvement you make will have a direct effect: if you reduce database reads and writes by 10%, those charges will go down by 10%.

This is, of course, very different from the current prevailing model of pricing by discrete "server units" in a traditional data center or a dedicated hosting environment. There, if you reduce database operations by 10%, you still own or rent that server, and the change has no effect on cost. Sure, if you have a very large application that runs on 100 or 1,000 servers, then such an optimization can yield some savings -- and very large-scale apps generally do receive a much more thorough cost analysis -- but again, typically as an afterthought and not at such a granular level.

Another interesting aspect is that cloud providers may offer a different cost structure than that of simply buying traditional servers. For example, they may charge a proportionally higher rate for disk I/O operations (Amazon charges $0.10 per million I/O requests to EBS) -- something that would barely enter into consideration when buying or renting discrete servers (whether physical or virtual).
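To make the component-level math concrete, here is a minimal sketch of the EBS I/O charge. The $0.10-per-million rate is the figure quoted above; the request volumes are invented purely for illustration:

```python
# Sketch of component-level cloud billing. The $0.10 per million EBS I/O
# requests is the rate quoted above; the traffic numbers are hypothetical.

def monthly_ebs_io_cost(requests: int) -> float:
    """Dollar cost of a month's EBS I/O at $0.10 per million requests."""
    return (requests / 1_000_000) * 0.10

baseline = monthly_ebs_io_cost(5_000_000_000)     # 5 billion requests/month
optimized = monthly_ebs_io_cost(4_500_000_000)    # 10% fewer reads/writes

print(round(baseline, 2))    # 500.0
print(round(optimized, 2))   # 450.0 -- the bill drops exactly in step with usage
```

Under discrete-server pricing both workloads would rent the same box and cost the same; under metering, the 10% optimization shows up directly on the bill.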

Now software design decisions are part of the operations budget. Every algorithm decision you make will have a dollar cost associated with it, and it may become more important to craft algorithms that minimize operations cost across a large number of resources (CPU, disk, bandwidth, etc.) than it is to trade off our old friends space and time.

Different cloud architecture will force very different design decisions. Under Amazon CPU is cheap whereas under [Google App Engine], CPU is a scarce commodity. Applications between the two niches will not be easily ported.
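The point above -- costing algorithms across many metered resources rather than just space and time -- can be sketched as a simple cost function. Every unit price and usage figure below is invented for illustration:

```python
# Compare two hypothetical algorithms by metered dollar cost rather than
# by classic space/time complexity. All unit prices here are invented.

PRICES = {
    "cpu_hours": 0.10,        # $/CPU-hour
    "gb_out": 0.17,           # $/GB transferred out
    "db_ops_million": 0.10,   # $/million database I/O operations
}

def dollar_cost(usage: dict) -> float:
    """Total monthly charge for a resource-usage profile."""
    return sum(amount * PRICES[resource] for resource, amount in usage.items())

# A chatty algorithm: light on CPU, heavy on metered database operations.
chatty = {"cpu_hours": 50, "gb_out": 40, "db_ops_million": 900}
# A batching algorithm: burns more CPU to cut database round trips 10x.
batched = {"cpu_hours": 65, "gb_out": 40, "db_ops_million": 90}

print(round(dollar_cost(chatty), 2))    # 101.8
print(round(dollar_cost(batched), 2))   # 22.3
```

On a rented discrete server the two variants would cost the same; under component-based metering, the batching variant is cheaper by a wide margin even though it uses more CPU.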

Todd recently updated this post with an example from a Google App Engine application in which:

With one paging algorithm and one use of AJAX the yearly cost of the site was $1000. By changing those algorithms the site went under quota and became free again. This will make life a lot more interesting for developers.

So what architectural changes can you make to reduce costs on the cloud? Here's one example:

A while back I wrote a post about GigaSpaces and the Economics of Cloud Computing. GigaSpaces has been -- for those of you new to my blog -- my employer for the past 5 years. I gave five reasons for why GigaSpaces will save costs on the cloud, but what I discuss above adds a sixth one. Because GigaSpaces uses an in-memory data grid as the "system-of-record" for your application, it significantly reduces database operations (in some cases a 10-to-1 reduction). In AWS, this could significantly reduce EBS and other charges. It also happens to be good architectural practice. For more on that see David Van Couvering's Why Not Put the Whole Friggin' Thing in Memory?
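The database-offloading effect can be illustrated with a generic read-through cache. To be clear, this is not the GigaSpaces API -- just a minimal stand-in for the in-memory system-of-record idea:

```python
# Minimal read-through cache: reads are served from memory and the
# database is hit only on a miss. Not the GigaSpaces API -- just a
# stand-in to show how an in-memory system of record cuts DB operations.

class ReadThroughCache:
    def __init__(self, db_fetch):
        self._db_fetch = db_fetch   # fallback loader, e.g. a SQL query
        self._store = {}            # the in-memory copy of the data
        self.db_reads = 0           # counts actual (billable) DB reads

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._db_fetch(key)
            self.db_reads += 1
        return self._store[key]

cache = ReadThroughCache(db_fetch=lambda key: f"row-{key}")
for _ in range(10):                 # ten reads of the same hot record...
    cache.get("widget-42")
print(cache.db_reads)               # 1 -- a 10-to-1 reduction in DB reads
```

On a metered cloud, those nine avoided reads are nine operations that never show up on the bill.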

Taking this approach as an example, it could have saved a significant portion of Michael Janke's $250,000 query cost off the cloud, and perhaps an even bigger proportion on the cloud.

If anyone has other ideas on how architectural decisions could affect costs on the cloud, please share them in the comments.

January 15, 2009

Michael Fitzgerald wrote a nice piece on cloud computing, entitled When the Forecast Calls for Clouds, which was published in Inc. Magazine this week (online and in print). The article gives a nice overview of the topic and then goes into three specific use cases -- on-demand unexpected scalability, infrequent large batch jobs and volatile (seasonal) loads. It gives actual customer examples and discusses the technologies they used, such as GigaSpaces and RightScale.

December 17, 2008

Dekel Tankel sent me a link to a recorded demo of the GigaSpaces CloudTools that he and Owen Taylor prepared. The demo shows a stock ticker application updated in real-time running on Amazon EC2 (using dummy data). In the demo, Owen shows how the application can auto-scale based on increased load and how it can heal itself in case any of the Amazon instances fail.

First, Owen shows how the application can be deployed on EC2 with essentially a single click.

The application then starts running with 3 web containers and 4 processing service containers (2 partitions - or "shards" - each with a synchronous backup). As Owen simulates more concurrent users, the system reaches a pre-set SLA threshold on CPU utilization and automagically launches two additional instances on EC2. All of this happens with the application continuously running, so that the end user is unaware of this happening in the background. Owen then kills one of the partitions and shows how the backup becomes the new primary and another backup is launched to take its place. Again, in a way that is unnoticeable by the user.
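The scaling behavior in the demo boils down to a simple SLA rule. The thresholds below are hypothetical, not the demo's actual configuration:

```python
# Toy version of an SLA-driven scaling rule: add an instance when average
# CPU crosses a ceiling, retire one when it falls below a floor. The
# 70%/30% thresholds are invented, not the demo's actual settings.

def scale_decision(avg_cpu: float, instances: int,
                   scale_up_at: float = 0.70,
                   scale_down_at: float = 0.30,
                   min_instances: int = 2) -> int:
    """Return the desired instance count for the observed average CPU."""
    if avg_cpu > scale_up_at:
        return instances + 1            # SLA breached: launch an instance
    if avg_cpu < scale_down_at and instances > min_instances:
        return instances - 1            # mostly idle: shed an instance
    return instances                    # within the SLA band: no change

print(scale_decision(0.85, 4))   # 5 -- simulated load spike
print(scale_decision(0.20, 4))   # 3 -- load drops off
print(scale_decision(0.50, 4))   # 4 -- steady state
```

The real system layers fail-over on top of this, of course, but the core loop is just "measure against the SLA, then launch or retire."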

Owen also demonstrates how you can move services to different EC2 instances by literally dragging and dropping them in the GigaSpaces UI.

Very impressive stuff and a very rich UI to give you an exact picture of what's going on.

November 14, 2008

Microsoft will be making an announcement in San Francisco about the launch of Online Services. I've been invited to attend that as well as a Blogger Roundtable that Microsoft is hosting for cloud computing bloggers. Should be interesting.

On Tuesday (10 AM PST), GigaSpaces and CohesiveFT are hosting a joint webinar entitled Cloud Enablement with Security and Control. The highlight of this webinar is a wicked cool live demo of running a web application across two public clouds (Amazon EC2 and Flexiscale) with the ability to scale out on demand based on SLAs, fail-over, self-healing -- all in a secure environment. So as the Cohesive guys wrote on their blog, make sure to brush up on your cloud computing buzzwords, because we will show it all: cloudbursting, cloud storming, cloud spanning, virtual private clouds, hybrid clouds, interclouds -- you name it.

Sys-con's Cloud Computing Expo will take place in San Jose. Seems like everyone is coming in to the Bay Area for this one, so I am looking forward to meeting a lot of the folks active in cloud computing at this event. There is also talk of another Cloud Computing Interoperability Forum get together around this event as well as a Silicon Valley CloudCamp.

So it's going to be a fun week. If anyone wants to get together at one of these events, just let me know in the comments or via twitter.

November 07, 2008

Shay Hassidim, deputy CTO at GigaSpaces, posted an impressive write-up of a benchmark the team ran on Amazon EC2. What's nice about it is that they took a standard web app, in this case the Spring PetClinic, and dropped it into the GigaSpaces container, achieving instant low-latency and scalability, with out-of-the-box load-balancing and fail-over. Extremely cool.

The other components in the app include standard and open source components: Jetty, MySQL, Apache load-balancer, JMeter and Ant.

November 03, 2008

Last week we made a very exciting announcement about Miwok Airways selecting GigaSpaces as the application server for their reservation and pricing engine, which will run on EC2. This is a great case study for cloud computing.

For one thing, you have to love the fact that it is cloud computing used for a business that literally runs in the clouds (the actual meteorological kind). Second, it is an on-demand compute infrastructure for a business that has an on-demand business model in the real world. A perfect fit.

There is a great piece in the LA Times that describes Miwok, but let me give you a brief description from the software application angle.

The idea is that for so-called ultra-short flights (typically, less than 250 miles), as a traveler you have a terrible dilemma: use commercial airlines or drive your car. I don't need to tell you the hassle and costs involved in both options these days.

Miwok overcomes the hassles of these options by providing you with an on-demand "air taxi" service. You book your flight when you need it. So, say, you want to fly from Santa Monica to Orange County or Palm Springs. You go to the Miwok web site and say when and where, you get pricing and you can book the flight on the spot. The flight you are booking is for a private Cirrus SR22. You can park 100 feet from the airplane itself (at a local airport, not just the major ones) and you don't need to go through security (imagine that!). All of this at the same cost as a commercial flight.

But here's the part I really like: You can connect to other people via Miwok's own social network, or through a Facebook app (and others to come). As the Cirrus can seat 3 passengers, you can split the costs with other passengers who need to make the same trip. So the flight could end up significantly cheaper than a commercial airline.

Think about it: this is the exact opposite of the big airlines' pricing model, where the more people book a flight, the higher the price goes. From a marketing point of view, this has tremendous viral potential.

One of the biggest technology (technology as in software, not aviation) challenges Miwok was facing was developing an extremely sophisticated real-time pricing engine. It needs to take many parameters into account to offer you a price on the spot, including location, path, season, date, time of booking, number of passengers and several other criteria. It needs to be able to grow and shrink on-demand, especially because of the social networking and viral effect.
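Miwok's actual formula is obviously not public, so here is a purely hypothetical sketch of what a real-time quote function over those inputs might look like. The parameters (distance, passengers, season, booking lead time) come from the post; every weight and rate is invented:

```python
# Purely hypothetical pricing sketch -- the inputs (distance, passengers,
# season, booking lead time) come from the post, but every weight and
# rate below is invented for illustration.

def quote_price(distance_miles: float, passengers: int,
                season_factor: float, days_in_advance: int) -> float:
    """Per-passenger fare for an on-demand short-hop flight."""
    fare = 120.0 + 1.8 * distance_miles         # fixed cost plus per-mile rate
    fare *= season_factor                       # e.g. 1.25 in high season
    if days_in_advance < 2:
        fare *= 1.15                            # last-minute premium
    return round(fare / max(1, passengers), 2)  # split across booked seats

# One traveler vs. a full 3-passenger Cirrus SR22 on the same 110-mile hop:
print(quote_price(110, 1, 1.0, 7))   # 318.0
print(quote_price(110, 3, 1.0, 7))   # 106.0 -- sharing cuts the fare 3x
```

The real engine evaluates many more criteria, and the viral cost-splitting effect is exactly why its load is so hard to predict.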

The architecture Miwok selected uses MySQL and Hibernate for the persistence layer, but the database is not used as the system of record for calculations and reservations. Instead they use GigaSpaces' in-memory data grid, which gives you in-memory speeds and can also grow and shrink dynamically in the EC2 environment. The benefit for Miwok is that, having very little advance knowledge of the traffic they will get and expecting extreme peaks and troughs in activity, they don't need to pre-plan and invest upfront in the infrastructure. They use GigaSpaces and EC2 and will only pay for hardware and software on a per-use basis -- when and if they actually need it.

They also use GigaSpaces XAP (which includes the in-memory data grid) as the container for the business logic, written in Java, and as a bus for integrating the various underlying services involved in generating pricing and booking reservations.

In short, on-demand application scalability for an on-demand air travel service.

October 05, 2008

As Dekel has already written on his blog, we are releasing our cloud framework for private beta. This was intended as a very "soft launch" sent to about 30 customers and partners already using GigaSpaces on public clouds (such as EC2, GoGrid, Joyent and Flexiscale) to help us test it out and provide feedback on how we can improve it.

But given that James Urquhart and a couple of others have already given it some more exposure, I might as well give some more public details about it.

First, let me clarify that GigaSpaces XAP, as is, is not only ready for the cloud now, but is already being used as a "cloud application server" on EC2 by several large and start-up companies. Even before the advent of public clouds, such as Amazon EC2, GigaSpaces was already being used by corporations in on-premise grids and by Web 2.0 start-ups for handling peak loads and the need to rapidly scale.

Because it allows on-demand scalability, provides transparent fault-tolerance, enables data affinity and is extremely high-performance, as a technology, GigaSpaces XAP is ideally suited for public cloud environments. But there were a few pieces missing in terms of the "customer experience" -- and we are working hard to provide them.

One of them was pricing suitable for the on-demand model. We had already taken care of that as part of the announcement of GigaSpaces availability on EC2 back in June by providing a pay-per-use pricing model (CPU/hour).

Another one was simplifying and automating the deployment of applications to a public cloud environment. That's what the "cloud framework" is about. It enables rapid cluster configuration and "one-click" deployment on public clouds through a Web interface. The current version supports EC2, but was built in a way that allows us to easily add other public clouds, which we will gradually implement. To learn more, see the documentation for the cloud framework.

If you are interested in or considering using GigaSpaces on EC2, are part of our Start-Up Program, or are an existing "on-premise" customer, please let us know if you'd like to participate in the private beta (email cloud at gigaspaces.com). You'd first need to sign up for a GigaSpaces EC2 license.

The official public beta release of the cloud framework will be announced soon. Shortly thereafter I hope to announce tighter integration with our partners at RightScale, CohesiveFT and others. And there's more to come...

September 29, 2008

On-Demand Enterprise just published a piece I wrote about some of the lessons learned from the current financial crisis and cloud computing. Although there is of course no direct connection, there are some interesting conclusions.

Here's an excerpt:

Although I'm not suggesting that it was the crux of the current crisis, one of the questions that has come up recently is why Wall Street's massive investments in value-at-risk analysis systems did not curb the downfall. We have not heard (at least not yet) that these risk analyses were blinking with red lights and were ignored. In an HPCwire piece entitled "The Quantitative Models Tanked Too," editor Michael Feldman tackles this issue. Although not the only explanation, one of the issues raised in the article is the fact that the models used were over-simplified. Feldman explains that "in some cases, limits in computational power made these simplifications necessary so that the valuation models could be run."

Done right, cloud computing offers nearly limitless computing power. Had they used a cloud service such as EC2, and software that is built to scale on the cloud, the quants on Wall Street could have had easy, cheap and on-demand access to massive computation power.

Another exciting thing about this exercise in the lab was that it was, in general, easy. We added and removed nodes, scaled linearly, pulled nodes out, re-deployed the application in seconds to minutes, processed large amounts of data -- and we did all this in the cloud with really minimal effort.

Some formal announcements about our partnership with Joyent coming soon.

The announcement by Oracle this week that they are "on the cloud" was once again quite an amusing piece of public relations. I don't know if Oracle is serious when they make these announcements or if they are secretly smiling to themselves.

Not everybody ate it up (although many did). Vinnie Mirchandani, a keen observer of the enterprise software space, wrote: "It will not reduce costs of Oracle licensing or even worse, the annual maintenance cost."

Oracle seemed to have missed one of the main points of cloud computing -- that it's an on-demand approach and that includes software licensing costs. In case you didn't notice, Oracle is not offering pay-per-use pricing for the cloud. All they are saying is that you can use the regular perpetual, upfront, fully-paid software license on EC2. Big Whoop.

The other major problem with the Oracle announcement is that their software really isn't suitable for a cloud environment. It can't grow and shrink on-demand, it's not self-healing, it's dependent on centralized components, it's complex. It's just not right.

Another interesting comparison is revealed by this amusing press release from Oracle about a start-up customer named Qtrax. Look at the stack Oracle managed to sell them to build their application. I put in brackets the price per CPU for each Oracle product from the official price list, and I quote:

"Qtrax's implementation includes Oracle Database [$17.5k to $47.5k], Oracle Real Application Clusters [$23k], Oracle Enterprise Manager [$3.5k to $20k+] and components of Oracle Fusion Middleware [?], including Oracle Application Server [$10k to $30k] and Oracle Coherence [$4k to $25k]. With this software now in place, Qtrax will have the ability to support millions of concurrent users [they better!]."

On top of these numbers (which total in the range of $58k to $145.5k per CPU) add a 22% annual support fee. As these are perpetual licenses, let's break the license numbers down to an hourly rate by assuming 24/7 use for 3 years: we get $2.21 to $5.54. Even if you decide to be generous and divide by 4 years, you get $1.66 to $4.15. Now, let's not forget that Oracle doesn't actually offer any special pricing for its products on EC2 (i.e., an hourly rate), so you would have to buy the licenses upfront, as Qtrax apparently did.
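The amortization here is straightforward division, and a few lines make it easy to rerun with different assumptions (support fees excluded):

```python
# Amortize a perpetual per-CPU license over continuous 24/7 use, as in
# the back-of-the-envelope calculation above. Support fees excluded.

HOURS_PER_YEAR = 24 * 365   # 8,760

def hourly_rate(license_cost: float, years: int) -> float:
    """Effective per-CPU hourly rate for a perpetual license."""
    return license_cost / (years * HOURS_PER_YEAR)

for cost in (58_000, 145_500):     # low and high end of the Qtrax stack
    for years in (3, 4):
        print(f"${cost:,} over {years} years: "
              f"${hourly_rate(cost, years):.2f}/CPU-hour")
```

Compare those effective rates with true pay-per-use pricing, where a server that sits idle half the time costs half as much.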