Every Halloween, we host a party for family and close friends. The tradition was started by my mother-in-law many years ago; when we inherited her house after her passing a few years back, we inherited the annual party too.

As the man of the house, I became responsible for setting up the Haunted Graveyard in the backyard. 2010 is my third attempt. The first, in 2008, was an unmitigated disaster, although in my defense I learned only a few hours before the party that I was responsible for it, and since I had never done it before, I had no supplies. I was spared utter humiliation only because the audience that year was just 3-5 years old. Last year was much better and received many compliments, although in my opinion it still left much to be desired.

Here are some pictures of the construction of my 2010 Haunted Graveyard.

The grassy area in the backyard is about 15 feet wide and 30 feet long. This year I decided to use some 2-foot-tall fencing I have to mark out an inverted U-shaped pathway through the graveyard, with the characters and decorations inside the U, along the sides, and past the end of the U on the opposite side of the path. This worked better than the open area I had last year: it kept people moving along, so more people could experience the graveyard more quickly, and it helped keep the kids from hitting the characters in the graveyard.

I bought the fencing at the hardware store a few years ago and store it in the side yard, so it has a great weathered, run-down look, with some of the fenceposts missing. It really helps set the tone.

With the fencing all laid out it was time to bring in the decorations. First was a hanging ghost guy. He’s supposed to light up and make weird noises, but after a year in storage he had stopped working. He still looks good though, so I decided to use him.

Next was a standing mummy with light-up eyes that makes creepy noises. I got him cheap the day after Halloween last year, but I hadn’t tried him out until now. I think he worked pretty well. He didn’t stay in that position though, as you can see in later pictures. I also brought out our collection of tombstones. These are great because they fill a lot of space, and with appropriate lighting on them they actually look pretty spooky at night. They’re just made of styrofoam, so I used tent stakes behind them, with the hook actually digging into the foam, to hold them up.

Then I put in an animated ghost guy with light-up eyes and, of course, creepy noises. He’s sitting down in this picture, between the mummy and the tombstones. Behind the far fence, past the mummy, you can just make out the yellow eyes of a giant inflatable spider that I borrow from my sister-in-law for the event.

My 16-year-old daughter helps me set up the graveyard. Here we’ve put some cobwebs on the fence, for ambiance.

We also put some plain white masks in the bushes on the left side. Those turned out really well, since they are just featureless, eyeless faces. With a blacklight behind them they glow in a very eerie way too. And, they were the cheapest item out there, at just a couple bucks each.

The next addition was a pop-up zombie character, on the right-hand side. He’s hard to see since he’s lying down in this picture. You can see the mummy in a new position further back too. That’s still not his final resting place (ha!) though.

Everything’s in place here. The mummy has been moved, again, to the right side of the graveyard. The animated ghost is standing up in this picture so he’s easier to see.

Here’s a picture of the graveyard after dark. There’s a fog machine in the back, making the smoke you see. The yellow glow in back is from the giant spider. The blue glow on the left is from the masks in the bushes.

A couple of details are hard to see in those pictures, but I was really pleased with how they turned out, so I got close-up shots of them. First is a collection of zombie Barbies. I got the idea here, which I must have found via Reddit. It’s really easy to make zombie Barbies — just grab a bunch of old Barbie dolls (my 7-year-old had about 50, so she was happy to sacrifice a few for the cause), spray-paint them white, then use a Sharpie to blacken the eyes and draw scars. Everybody agreed that the Barbies were super creepy:

Next are these little skulls. These were just some cheap styrofoam skulls that came in a multi-pack of “graveyard stuff” from the Halloween super store. Last year I just arranged them around the tombstones, but this time I got a little more creative.

I also really liked this flying ghost. He’s attached to a cord that’s stretched taut between a tree and our house. He moves back-and-forth along the cord, making weird noises as he goes.

Finally, here’s a shot of one of the creepy masks I mentioned. Remember, this is just a $3 mask and a black light. Hard to beat that bang-to-buck ratio!

Here’s the obligatory “action shot” of the graveyard from the party. You can just make out my daughter in the back. She likes to watch the kids go through the graveyard, and she makes loud noises now and then to trigger the sound-activated characters, if the kids aren’t noisy enough themselves.

If success is measured by the number of kids that are too scared to finish walking through the graveyard, or who refuse to go into the backyard afterwards, then this year’s graveyard was a smashing success. But there’s still plenty of room for improvement. Next year I need more lights, to better highlight some of the characters. And I need some kind of curtain or partition down the middle of the central area, so that you can’t see what’s on the right side of the graveyard as you’re walking on the left side, maybe something like this wall of fog. Last, I think I can do more with the zombie Barbies — perhaps a zombie Barbie mansion, complete with a partially eaten Ken?

For us, the decision was easy. The public cloud is unsuitable for three reasons: platforms, bandwidth and money. First, the public cloud doesn’t support the platforms we need for testing. Second, uploading data to the public cloud takes way too long by today’s agile, continuous development standards. Finally, and probably most interesting to you, the public cloud is surprisingly expensive. In fact, I estimate that the public cloud would cost us more than twice as much as our private cloud, every year.

Public clouds don’t support all of our platforms

My product is supported on a smörgåsbord of x86-based platforms: various incarnations of Windows, from XP to Vista to Windows 7, and a variety of Linux distributions from RHEL4 to Ubuntu 10. Our quality standards demand that we run the platform-dependent portion of our test suite on every supported platform. Pretty standard stuff, I imagine. Too bad for us, then, that you can’t run XP, Vista or 7 in the cloud (see also here and here).

Bandwidth to the public cloud stinks

My company is connected to the Internet via a puny 10 Mbit EoC pipe. In comparison, our internal network uses fat GigE connections. Under ideal conditions, it takes 100x longer to transfer data to the public cloud than to our private cloud. Think about that for a second. Heck, think about it for 600 seconds: that’s how long it would take me to upload 750 MB, the total size of our install packages. And that’s best case. When’s the last time you hit the advertised upload speed on your Internet connection?
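The arithmetic behind that 600-second figure is worth making explicit. Here’s a quick back-of-envelope sketch (the function name is mine; it assumes ideal, full-rate transfers and decimal megabytes):

```python
def transfer_seconds(size_mb: float, link_mbit_per_s: float) -> float:
    """Ideal-case transfer time: megabytes -> megabits, divided by link rate."""
    return size_mb * 8 / link_mbit_per_s

# 750 MB of install packages over the 10 Mbit uplink vs. internal GigE
wan = transfer_seconds(750, 10)    # 600 seconds, i.e. 10 minutes
lan = transfer_seconds(750, 1000)  # 6 seconds
print(wan / lan)                   # the 100x gap between the two links
```

Real-world numbers will be worse: protocol overhead and contention mean you rarely see the full advertised rate.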

The public cloud is expensive

Many people assume that the public cloud will be cheaper than a private cloud. A day’s worth of compute time on Amazon EC2 costs less than a Starbucks latte, and you have no upfront cost, unlike a private cloud which has substantial upfront capital expenses. But it pays to run the numbers. In our case, the public cloud is more than twice the cost of a private cloud:

I split the costs into two buckets, because we have two fundamentally different usage models for the VM’s in our cloud. First are the systems used by our continuous integration server to run automated tests. Each CI build uses 12 Linux and 8 Windows systems, one for each supported platform. Our testing standards require that those systems be dual-core, but the workload is light since they just run unit tests and simple system tests. We have three such blocks of 20 systems, so we can run three CI builds simultaneously. Because the CI server never sleeps, these systems are always on.

Second are the systems used day-to-day by developers for testing and debugging. Each developer may use just a few systems, or more than a dozen depending on their needs. It’s hard to pin down the precise duty cycle, but eyeballing data from our cloud servers I estimate we have about 80 systems in use per day, for about 8 hours each. They are split roughly 50/50 between Linux and Windows. Two-thirds of the systems are single-core, and the rest are at least dual-core.

Pricing the public cloud

Once you know the type and quantity of VM’s you need, and for how long, it’s straightforward to compute the cost of the public cloud. Because I’m most familiar with Amazon EC2, I’ll use their pricing model. For our CI systems, we would use a mix of Medium and Large instances to match our requirements for multi-core and 64-bit support. Because they are always-on, we’d opt to use the Reserved instance pricing, which offers a lower hourly cost in exchange for a fixed up-front reservation fee.

For developer systems, we would use On-Demand instances, with a mix of Small and Large instances:

Continuous integration systems

  Medium instances
    Annual fee         =  $15,015.00  (33 systems at $455 per system)
    Linux usage fee    =  $14,716.80  (21 systems, 24 hours, 365 days, $0.08 per hour per system)
    Windows usage fee  =  $15,242.40  (12 systems, 24 hours, 365 days, $0.145 per hour per system)

  Large instances
    Annual fee         =  $24,570.00  (27 systems at $910 per system)
    Linux usage fee    =  $21,024.00  (15 systems, 24 hours, 365 days, $0.16 per hour per system)
    Windows usage fee  =  $25,228.80  (12 systems, 24 hours, 365 days, $0.24 per hour per system)

  Subtotal             = $115,797.00

Development systems

  Small instances
    Linux              =   $4,940.00  (26 systems, 8 hours, 250 days, $0.095 per hour per system)
    Windows            =   $6,760.00  (26 systems, 8 hours, 250 days, $0.13 per hour per system)

  Large instances
    Linux              =  $10,640.00  (14 systems, 8 hours, 250 days, $0.38 per hour per system)
    Windows            =  $15,600.00  (14 systems, 8 hours, 250 days, $0.52 per hour per system)

  Subtotal             =  $37,940.00

Total                  = $153,737.00
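The always-on CI subtotal can be sanity-checked from the Reserved pricing quoted above. A quick sketch (the helper name is mine; the fees and hourly rates are the 2010-era EC2 prices cited in the table):

```python
HOURS_PER_YEAR = 24 * 365  # CI systems never sleep

def reserved_annual(systems: int, reservation_fee: float, hourly: float) -> float:
    """Annual cost of always-on Reserved instances: upfront fee plus usage."""
    return systems * (reservation_fee + HOURS_PER_YEAR * hourly)

ci_subtotal = (
    reserved_annual(21, 455, 0.08)     # Medium, Linux
    + reserved_annual(12, 455, 0.145)  # Medium, Windows
    + reserved_annual(15, 910, 0.16)   # Large, Linux
    + reserved_annual(12, 910, 0.24)   # Large, Windows
)
print(f"${ci_subtotal:,.2f}")  # $115,797.00
```

The same per-system-hour pattern applies to the On-Demand developer systems, minus the reservation fee and with 8 hours over 250 working days instead of 24/7.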

Pricing the private cloud

It’s somewhat harder to compute the cost of a private cloud, because there is a greater variety of line-item costs, and they cannot all be easily calculated. The most obvious cost is that of the hardware itself. We use dual quad-core servers which cost about $3,000 each. Six of these servers host our CI VM’s. Note that this is only 48 physical cores, but our CI VM’s use a total of 120 virtual cores. This is called oversubscription, and it works because the load on the virtual cores is light — if each virtual core is active only 30-50% of the time, then one physical core can support 2-3 virtual cores.
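The oversubscription arithmetic is simple enough to check directly (the numbers below are the ones above; the 30-50% duty cycle is the article’s own estimate):

```python
# Six dual quad-core servers host sixty dual-core CI VMs.
physical_cores = 6 * 8   # 48 physical cores
virtual_cores = 60 * 2   # 120 virtual cores
ratio = virtual_cores / physical_cores
print(ratio)  # 2.5 virtual cores per physical core

# Sanity check: if each virtual core is busy 30-50% of the time, the total
# demand is 36-60 physical cores' worth of work, which brackets the 48
# physical cores actually available.
demand_low = virtual_cores * 0.30
demand_high = virtual_cores * 0.50
print(demand_low, demand_high)
```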

We use 15 servers for our on-demand development VM’s. Unlike the CI systems, these VM’s are subject to heavy load, so we cannot oversubscribe the hardware to the same degree.

The next obvious cost is the electricity to power our servers, and of course the A/C costs to keep everything cool. Our electrical rate is about $0.17 per kWh, and we estimate the cooling cost at about 50% of the electrical cost.

Finally, we must consider the cost to maintain our 21 VM servers. To compute that amount, we must first determine how much of a sysadmin’s time will be spent managing these servers. Data from multiple sources shows that a sysadmin can maintain at least 100 servers, particularly if they are homogeneous, as ours are. Our servers therefore consume at most 21% of a sysadmin’s time.

Next, we have to determine the cost of the sysadmin’s time. I’m not privy to the actual numbers, but salary.com tells me that a top sysadmin in our area has a salary of about $90,000. The fully loaded cost of an employee is usually estimated at 2x the salary, for a total cost of $180,000 per year.

Here’s how it all adds up:

Continuous integration systems

  Hardware             =   $6,000.00  (6 dual quad-core servers at $3,000 each, amortized over 3 years)
  Personnel            =  $10,800.00  (6% of a fully-loaded sysadmin at $180,000)
  Electricity          =   $3,082.65  (6 systems x 345 W x 24 hours x 365 days x $0.17 per kWh)
  Cooling              =   $1,541.33  (50% of electricity cost)

  Subtotal             =  $21,423.98

Development systems

  Hardware             =  $15,000.00  (15 dual quad-core servers at $3,000 each, amortized over 3 years)
  Personnel            =  $27,000.00  (15% of a fully-loaded sysadmin at $180,000)
  Electricity          =   $7,706.61  (15 systems x 345 W x 24 hours x 365 days x $0.17 per kWh)
  Cooling              =   $3,853.31  (50% of electricity cost)

  Subtotal             =  $53,559.92

Total                  =  $74,983.90
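The private-cloud table boils down to a simple per-server cost model. Here’s a sketch (function and constant names are mine; the figures are the ones stated above, with 1% of a sysadmin per server from the 100-servers-per-admin rule of thumb):

```python
KWH_RATE = 0.17      # dollars per kWh
SYSADMIN = 180_000   # fully-loaded annual cost of one sysadmin
ADMIN_SHARE = 0.01   # fraction of a sysadmin per server (100 servers/admin)

def private_annual_cost(servers: int, hw_price: float = 3000,
                        amortize_years: int = 3, watts: float = 345) -> float:
    """Annual cost of a block of identical, always-on VM servers."""
    hardware = servers * hw_price / amortize_years
    personnel = servers * ADMIN_SHARE * SYSADMIN
    electricity = servers * watts / 1000 * 24 * 365 * KWH_RATE
    cooling = 0.5 * electricity
    return hardware + personnel + electricity + cooling

ci = private_annual_cost(6)    # ~ $21,424 for the CI servers
dev = private_annual_cost(15)  # ~ $53,560 for the development servers
print(f"${ci + dev:,.2f}")     # within pennies of the $74,983.90 total
```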

Why is the public cloud so expensive?

I wasn’t surprised that the public cloud was more expensive, but I was surprised that it was that much more expensive. I had to figure out why it was so, and I think it comes down to two factors. First, we need 64-bit dual-core VM’s for our tests, but 64-bit support is only available on Large or better instances, which are at least 2x the cost of Medium instances. We would be forced to pay for more (virtual) hardware than we need.

Second, we benefit significantly by oversubscribing the hardware in our private cloud with 2.5 virtual cores per physical core. I have no doubt that Amazon is doing the same thing behind the scenes, but — and this is the real kicker — virtual cores in the public cloud are priced assuming a one-to-one virtual-to-physical ratio. Put another way, even though the public cloud provider is certainly oversubscribing their hardware and you’re only getting a fraction of a physical core for each virtual core, you still have to pay full price for those virtual cores. For all that increased hardware utilization is touted as a benefit of cloud computing, it only applies if you own the hardware.

Does it ever make sense to use the public cloud?

The results here are pretty dismal, but I think there are situations where the public cloud is the best choice. First, although private is cheaper in the long term, it requires a substantial upfront investment just to get off the ground — $63,000 for the hardware in our case. You may not have that kind of capital to work with.

Second, if your needs truly are “bursty”, the public cloud on-demand pricing is actually pretty competitive. Of course, you have to be really good about managing those VM’s — if you leave them powered on but idle, you still pay usage fees, which will quickly inflate your expenses.

Finally, if you’re just “testing the waters” to see if cloud computing will work for you, it’s definitely cheaper and easier to do that with a public cloud.

Private clouds for dev/test

Our private cloud has been a powerful enabling technology for my team. If you’re in a similar situation, you should seriously consider private versus public. You might be surprised to see how favorably the private cloud compares.


Cloud computing has been all the rage lately. But most of the attention has focused on deployment of applications in the cloud, or at best, development of applications for the cloud. I haven’t seen much discussion about the ways that cloud computing can support development of “traditional” software — all that stuff that is not destined for cloud deployment.

Over the past two years, my engineering team has gradually migrated from a large collection of physical servers to a private development cloud, which has enabled us to support a rapidly increasing matrix of platforms and also improved development efficiency and developer happiness. I thought I’d share our experiences.

The bad old days

Two years ago, my development team had a server room stuffed full of rack-mounted computers — literally hundreds of 1U systems. At one point we determined that we had about 40 computers per developer. Seems outrageous, right? But we develop cluster-based software, so for a full system test (involving all major components) a developer needs at least three machines, and often 10 or more. And that was just for one developer, working on one platform. Consider that we have ten development and QA engineers, and that we support over 20 platforms (different flavors/versions of Windows and Linux), and you can see how quickly it adds up, even accounting for systems set up to dual- (or triple-, or quadruple-) boot.

This arrangement was functional, but just barely. The server closet was a nightmare of network, power and KVM cables. We had to retrofit it twice, once to bring in more power, and again to bring in more cooling. Maintaining the systems was a full-time job and then some: keeping everything up-to-date on patches, replacing dead or too-small disk drives, protecting against viruses. And just imagine the nightmare when a new OS came out — start with a cluster of machines configured to dual-boot XP and Server 2003, and then you want to add Server 2008 to the mix. First you have to repartition the drive, assuming it’s even big enough to accommodate all three. Then you have to reinstall the original two OSes, and finally you can install the new OS. Multiply that by the number of machines and you’re looking at days or weeks of effort. Even if you use something like Ghost, inevitably you have a hodgepodge of hardware configurations, so you need to make multiple images.

And even with all the systems we had, we never seemed to have enough. Or rather, never enough of the right kind — when I needed to test on Windows, we only had Linux hosts available, or when I needed a multi-core system (which made up only a fraction of our total), I found they were all in use by my coworkers. Ironically, though we had hundreds of systems, most sat idle much of the time.

We had reached the point of crisis: we couldn’t squeeze any more systems into our server closet, nor any more operating systems on the systems we had. Something had to change.

So we got rid of all our servers.

Creating a private development cloud

Well ok, not all of them. Actually, we replaced our cornucopia of cheap computers with a couple dozen beefy servers — thanks to advances in hardware, we were able to get inexpensive 2U systems with 8 cores (dual quad-cores), a boatload of memory and large, fast disks. Then we put VMware’s ESX Server on them, and started using virtual machines instead of physical ones for the bulk of our development and testing needs. We didn’t realize it at the time, but we had created a private development cloud.

This approach has a lot of advantages over physical systems, which will be familiar to anybody who’s followed the cloud computing trend:

Increased utilization: each VM server hosts 10-12 virtual machines; although many VM’s are idle at any given time, others are not, ensuring that there is at least some load on each of the physical cores that we do have.

Greater flexibility: each “slot” in our virtual infrastructure can host a VM of whatever flavor we need. It doesn’t matter if there are 10, 20 or 100 Linux VM’s already deployed by other developers: if I need a Linux system, I can get it.

Elasticity: I can grow and shrink my virtual cluster as needed, at the touch of a button. I no longer need to haggle with my coworkers for resources, or wait patiently for somebody to finish their tests.

Self-service and ease of use: adding a new test system in our old infrastructure was a major chore: requisition hardware, get IT involved to find a place to rack it, plug it in, and install the OS or OSes. Best case scenario: days from the time I determine I need a new system to the time I can use it. With our private cloud, it’s literally as easy as visiting a web page, choosing the OS, the number of cores and the amount of RAM, and clicking a button. Ten minutes later I’m ready for business.

Reduced IT costs: instead of managing hundreds of computers, our IT department only maintains about 20 VM servers (which are all identical), and about 20 VM “templates” from which we create any number of VM instances. If a VM goes bad for any reason, we just discard it and regenerate from the template — nobody wastes time trying to “fix” a broken VM. Adding support for a new OS is dramatically easier: set up a single VM template with the new OS and publish it for use.

Lessons learned

Although things are working pretty well now, we had our share of difficulties in the transition. We didn’t have anybody in house with any particular experience with ESX Server, so there was a learning curve for that. One particular problem we had was figuring out how much disk space to allocate to each ESX server — we foolishly tried to lowball that axis, and we paid for that mistake with VM server downtime (and thus reduced cloud capacity) each time we realized we still had not allocated enough space. In short: get as much disk as you possibly can.

Another lesson learned was to avoid using the “Undeploy and save state” feature of ESX Server. That’s conceptually similar to suspending a system, versus powering it down, and it chews up storage space on the ESX Server, often for no good reason. And, we learned to avoid making clones of templates when deploying VM instances, again because it chews up storage space.

We also found that putting “too many” VM templates on a single disk partition caused significant filesystem lock contention, so we had to do some trial-and-error experimentation to find the “magic number” of templates per partition (it’s about 10, by the way).

Finally, we’ve found that although virtual machines fill the majority of our needs, we still need some physical machines, particularly for performance testing. Virtual machines are terrible for performance testing, first because it’s difficult to control the entire environment while running tests — the so-called noisy neighbors problem. Second, performance analysis, already an often arduous task, becomes nearly impossible with the extra complexity introduced by virtualization: not only do you need to be mindful of what’s happening on your VM, you must also be aware of what’s happening on the VM server, and possibly on other VM’s hosted on the same server.

Cloud computing for traditional dev/test

It was a bit of a rocky road to get where we are today, but I can say with confidence now that we absolutely made the right decision. Cloud computing is not just for scaling massive web applications: it is just as useful in traditional software development and test environments.

I wish I could quantify the positive impact on our development with data on improved quality or efficiency or reduced development time. I can’t. But I can make some concrete statements about the benefits we’ve enjoyed:

Most importantly, without our private cloud we would have been unable to grow our support matrix to the 20+ platforms it includes today.

Second, we reduced our IT cost by at least 6x, by reducing the number of systems that IT manages from (at least) 115 to just 21.

Finally, we cut our electrical bill by nearly 5x, from about $50,000 per year (115 physical servers, 300 watt power supplies, running 24/7, at a cost of $0.17 per kWh) to just $11,000 per year (21 VM servers, 345 watt power supplies). Likewise, we reduced our cooling costs from about $25,000 per year to about $6,000 per year.
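Those electricity figures follow directly from nameplate wattage running around the clock. A quick check (the helper name is mine; it assumes the servers draw their full rated wattage 24/7, which makes this a rough upper bound):

```python
def annual_power_cost(servers: int, watts: float, rate: float = 0.17) -> float:
    """Electricity cost for servers drawing nameplate wattage 24/7."""
    return servers * watts / 1000 * 24 * 365 * rate

old = annual_power_cost(115, 300)  # ~ $51,000 per year, the old fleet
new = annual_power_cost(21, 345)   # ~ $10,800 per year, the VM servers
print(old / new)                   # roughly a 4.8x reduction
```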

Beyond that, all I have is anecdotal evidence and the assurances of my teammates that “things are way better now.” For my part, the fact that I no longer have to arm wrestle my coworkers for access to resources makes it all worthwhile.