In this three-part series, Peter Koen shares his paper titled "Move to the Blue Cloud and Get Green". He discusses how Microsoft Windows Azure can help your IT department increase efficiency and lower TCO while achieving impressive performance, all while doing something good for the environment. The series demonstrates how your IT department can build applications that not only satisfy your day-to-day need for office task automation, but are also state of the art in technology, offer superior performance, lower operating costs, and have minimal impact on the environment.

Cost = Environmental Impact

The cost of an application is usually measured in dollars: initially to build or buy the application, and over the longer term to run it. This notion of cost was a good metric for the past decades of computing, when energy was regarded as abundantly available and environmental cost as non-existent. In today's world, where energy is becoming scarce and more expensive and global warming is a widely accepted and feared phenomenon, this is changing. Companies are starting to think about their energy acquisition costs and their impact on the environment. Doing good things for the environment is no longer just a tax-deductible charity; it can become a real cost advantage and sales driver.

While the economic cost of an application is measured in monetary terms (for simplicity we will use $), the environmental cost is measured in pounds of CO2. Work performed is traditionally measured in computations per unit of time. Unfortunately this measure is not very useful for us, because the architecture and hardware chosen for the application can change it significantly while still producing similar output. Even more important, computations per time slot are not directly related to the energy consumed or the waste generated. So we will use a more basic measure of work performed: Watts consumed.

Watts consumed is an ideal measure for the problem at hand, as it correlates both with $ and with pounds of CO2. For the sake of simplicity we will assume that the electricity provider charges $0.10 per kWh. Producing 1 kWh of energy emits on average (depending on the energy source) about 2.3 pounds of CO2.

So, let’s look at what this means to computing. First of all we will need an Energy Usage Profile (EUP) to estimate the power usage of a server. The EUP is a fundamental tool in helping us to define energy efficient architectures and to find out which part of our application needs more focus.

Dell and Principal Technologies published a study that gives us the EUP of a typical server, in this case a Dell PowerEdge M600 blade server [1]. This server consumes approximately 383.75 Watts while idle. In other words, without performing any useful computation it already needs almost 400 Watts just to wait for work. The theoretical maximum power usage of this server is 600 Watts, with an observed average of about 454.39 Watts. Note how little of the energy is actually used for useful computational work compared to the energy needed to "keep the lights on".

The breakdown in table 1 shows the details for the components in this server, which will be useful for us later.

Table 1: Sample Power Usage in Watts

Component                    Idle      Average   Maximum
CPU                          40.8      -         130
HDD                          14.35     -         17
DIMM1                        3         -         3
DIMM2                        3         -         3
Video                        18.3      -         25.6
NIC                          4.95      -         4.95
CD/DVD                       2         -         18
Other electrical components  ~297.35   -         ~398.45
Total                        383.75    454.39    600

(Averages for individual components are not given; only the total average was observed.)
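The component figures in Table 1 can be cross-checked with a few lines of Python; a minimal sketch (component names and wattages are taken straight from the table above):

```python
# Per-component power draw in Watts, from Table 1.
# Each entry is (idle, maximum); "Other" covers fans, power supply, etc.
components = {
    "CPU": (40.8, 130),
    "HDD": (14.35, 17),
    "DIMM1": (3, 3),
    "DIMM2": (3, 3),
    "Video": (18.3, 25.6),
    "NIC": (4.95, 4.95),
    "CD/DVD": (2, 18),
    "Other electrical components": (297.35, 398.45),
}

# The per-component values should add up to the table's Total row.
idle_total = sum(idle for idle, _ in components.values())
max_total = sum(peak for _, peak in components.values())

print(f"Idle total: {round(idle_total, 2)} W")     # → 383.75 W
print(f"Maximum total: {round(max_total, 2)} W")   # → 600.0 W
```

Summing the columns confirms that the idle and maximum totals in the table are simply the sums of the individual components.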

The Cost of Power

Let’s examine what the power usage of this server means for us in cost:

If the machine is idle it draws 383.75 Watts, i.e. 0.38375 kWh every hour. That doesn't sound too bad, but assume operation 24 hours a day, 365 days a year, and it starts to look different.

Total energy consumed per year = 0.38375 kWh × 24 × 365 = 3361.65 kWh. At $0.10 per kWh that is about $336.17 in power costs per year while the server is not performing any useful work. At peak power usage it would be as much as $525.60.

An even grimmer outlook is the CO2 generated by the energy used to run this server: at least 3361.65 kWh × 2.3, roughly 7731.8 pounds of CO2 for an idle server, or 12088.8 pounds of CO2 at peak performance.
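The arithmetic above can be condensed into a short Python sketch (the $0.10/kWh price and the 2.3 lbs CO2/kWh factor are the assumptions stated earlier):

```python
PRICE_PER_KWH = 0.10      # assumed electricity price, $ per kWh
LBS_CO2_PER_KWH = 2.3     # average CO2 emitted per kWh generated
HOURS_PER_YEAR = 24 * 365

def yearly_footprint(watts):
    """Return (kWh/year, $/year, lbs CO2/year) for a constant draw in Watts."""
    kwh = watts / 1000 * HOURS_PER_YEAR
    return kwh, kwh * PRICE_PER_KWH, kwh * LBS_CO2_PER_KWH

for label, watts in [("idle", 383.75), ("peak", 600)]:
    kwh, dollars, co2 = yearly_footprint(watts)
    # idle ≈ 3361.65 kWh, ~$336, ~7732 lbs CO2
    # peak ≈ 5256 kWh, ~$526, ~12089 lbs CO2
    print(f"{label}: {kwh:.2f} kWh/year, ${dollars:.2f}, {co2:.1f} lbs CO2")
```

Running it for the idle and peak figures of our sample server reproduces the numbers above.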

Considering that IT infrastructure has for decades been designed for the optimistic case in terms of peak performance and for the worst case in terms of hardware failure, you will rarely find a single-server solution today. Most likely your solution will have a redundant database server, a load-balanced application server, and a load-balanced web server. This means that even the simplest enterprise solution runs on six servers, sized to handle the application at full speed at peak times under a theoretical maximum load from all users. With our sample server from above, just keeping the service up and running would cost about $3,000 a year in energy and generate more than 45,000 pounds of CO2.

Where to Save

Revisiting the detailed per-component power statistics in table 1, you can clearly see that the CPU is by far the biggest consumer of power among the computing components. The last table row shows the power used by electrical devices that perform no computational work, such as fans and the power supply. We will deal with those components in more detail later.

So how can we start saving energy, and therefore real money? Of course you don't want to sacrifice computing performance when you design your system to be green. Still, there are plenty of steps you can take to make better use of the power you draw. The best starting point for your investigation is how much power your system needs just to be up and running, waiting for tasks. Say that in the situation described above, half of your servers are simply sitting on standby, waiting for their counterparts to go down. By deploying virtual machines you could eliminate two of the hot-standby machines, activating a virtual image of a failed system on a spare machine on demand. But more on infrastructure-based optimizations later.
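To put a rough number on that consolidation idea, here is a small Python sketch; the six-server setup with three hot standbys, and the idle wattage and electricity price, are the assumptions from the example above:

```python
IDLE_WATTS = 383.75       # idle draw of the sample server from Table 1
PRICE_PER_KWH = 0.10      # assumed electricity price, $ per kWh
HOURS_PER_YEAR = 24 * 365

def yearly_cost(watts):
    """Yearly electricity cost in $ for a constant draw in Watts."""
    return watts / 1000 * HOURS_PER_YEAR * PRICE_PER_KWH

# Before: 3 of the 6 servers are hot standbys burning idle power.
standby_cost_before = 3 * yearly_cost(IDLE_WATTS)

# After: two standbys replaced by on-demand virtual images on one spare machine.
standby_cost_after = 1 * yearly_cost(IDLE_WATTS)

savings = standby_cost_before - standby_cost_after   # ≈ $672 per year
print(f"Yearly savings from consolidating standbys: ${savings:.2f}")
```

Even this crude model, which ignores the extra load on the remaining spare machine, shows that idle standby capacity alone is worth hundreds of dollars a year per eliminated server.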

Many times the CPU isn't the bottleneck of an application; the storage infrastructure is. You end up powering cores that are not fully used, so your ratio of power consumed to actual computing is less than optimal. Again, virtualization or more intelligent application design can help.

Comparing the energy values in table 1 for the hard drive and the memory, you can see that memory uses just a fraction of the power needed to run hard drives. While the power usage of hard drives and memory is fairly constant, there is still room for optimization. Typically a hard drive is one of the potential bottlenecks of an application. Now think for a second what would happen if you simply added more memory to a machine, allowing you to create memory disks for your application. Provided you have a lot of static, read-only data, this removes the need to access the hard drive, boosts the performance of your application, and saves energy, because you can power down the hard drive once the data is loaded into the memory disk.
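The same idea can be approximated inside application code without an actual memory disk; a minimal Python sketch, assuming a static, read-only data file (the cache class and demo file are hypothetical illustrations, not part of the original study):

```python
import os
import tempfile

class StaticDataCache:
    """Load a static, read-only file into memory once, so that all
    later reads are served from RAM and never touch the hard drive."""

    def __init__(self, path):
        with open(path, "rb") as f:
            self._data = f.read()   # single disk read at startup

    def read(self, offset, length):
        # Served entirely from memory; the disk can spin down.
        return self._data[offset:offset + length]

# Demo with a temporary file standing in for the static data set.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"static reference data")
    path = tmp.name

cache = StaticDataCache(path)
os.remove(path)              # file is gone; reads still work from memory
print(cache.read(0, 6))      # → b'static'
```

The trade-off is the one stated above: it only pays off for data that is large enough to matter, read often, and rarely or never written.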

Disabling screensavers, redirecting output to a central monitoring station, making servers remotely manageable: all of these are ways of saving energy and money on the video processing of a server. Think again: do you really need a full set of peripheral devices on all of your machines?

As you can see, there are plenty of opportunities to start saving costs by right-sizing the base infrastructure for your application. Which approach saves the most energy without compromising the performance or stability of your application depends on your architecture and on your availability goals. You might also re-check the user requirements for the application: many times availability requests are born of the fear of losing business rather than of a technical and environmental thought process. Perhaps you have predictable low-usage times during which you could save money by consolidating applications into one virtualized environment, spreading them out across more machines when high-throughput periods are expected.

An interesting observation is that when you focus on greener architecture, you automatically start improving the application architecture as well. It still needs to be proven that the two are indeed correlated, but early observations of green-architecture efforts indicate that greener architectures tend to be simpler, and therefore more effective and better performing.

While it is often hard to determine where to start optimizing an application for performance just by looking at the code-level architecture, optimizing for green computing can give you good starting points for an overall increase in application performance.

Optimizing the Infrastructure

The low-hanging fruit in making your infrastructure greener and more cost effective is definitely virtualization.

In general, virtualization is understood as the idea of sharing resources between different applications or solutions. In its simplest form it means that each server application runs in its own virtual machine, which is then co-hosted on a physical host. But virtualization is much more than that: it is the general principle of sharing scarce resources. While hardware designers and kernel developers understand von Neumann's idea of virtualization very well, this low-level use of virtualization is far less widely recognized than its infrastructural forms.

Before we tackle the usage of low level virtualization like it happens in the kernel of an operating system let us first take a look at infrastructure virtualization and how we can define a meaningful model of maturity.

Let’s take a look at the 4 levels of virtualization maturity and how those affect the greenness of your solution. See Table 2 for a list and comparison of key metrics of virtualization maturity.

Table 2 - Level of Virtualization Maturity

Virtualization Maturity   Name          Applications   Infrastructure   Location      Ownership
Level 0                   Local         Dedicated      Fixed            Distributed   Internal
Level 1                   Logical       Shared         Fixed            Centralized   Internal
Level 2                   Data Center   Shared         Virtual          Centralized   Internal
Level 3                   Cloud         SaaS, S+S      Virtual          Virtual       Virtual

Level 0 denotes no virtualization at all: every application runs on its own dedicated server and shares no resources with other applications. This is the basic level and the least efficient in terms of resource usage. All of the systems consume idle power, data must be replicated, and computing steps that are identical across instances must be re-run on every machine. This was mostly the case before enterprises adopted N-tier application architecture, when siloed local applications ran on individual workstations.

Level 1 introduces the idea of sharing resources. This is typically an N-tier application in which a multitude of clients share the same database and application server. Identical work no longer has to be done on each client but can leverage server resources, and the need to copy data between clients disappears thanks to central storage on the server.

Level 2 deals with a broader slice of the infrastructure and is typically found within a data center. Several application servers are virtualized into virtual machine images, which are distributed and orchestrated on physical machines, with the ability to move images around on the available hardware to optimize application performance and resource usage. Although this level of virtualization maturity is rapidly gaining acceptance among IT operators, it is still energetically wasteful: every virtual image instantiates a separate operating system along with all the services the operating system provides, even those the application never uses. A certain base level of energy consumption is therefore required even when a virtual machine is idle. What is missing is an inherent level of multi-tenancy in which each virtualized image can share the base operating services without replicating them.

Level 3 is the most mature level and tackles the problem of Level 2. Applications are designed on a multi-tenant model, with a large number of tenants (users) running in the same data center and therefore sharing the underlying operating service infrastructure without replicating those services for each tenant. Furthermore, data centers of this scale are usually run by highly specialized companies with better equipment and the domain expertise to optimize power usage. Companies running large data centers can also buy hardware and energy at prices other enterprises can only dream of. Through optimal use of hardware, software, and energy resources, the average cost per computing activity is drastically reduced.

Next week in Part 2, Peter digs deeper into the optimization of each level, the principle of sharing and throws in a word of caution too.