
This content is part of the series: Reduce Linux power consumption, Part 3

About this series

In this series, learn how to tune your Linux-based IBM System x server for power
efficiency. You'll learn about the in-kernel governors and their settings and
how to use them; you'll also see the effects of the tuned governors on a
power-performance workload and an e-commerce workload. The examples are based on a System x server
running Red Hat Enterprise Linux version 5.2 (RHEL 5.2), but the same guidelines
apply to any of the 2.6.x kernels, as well as any processor type that supports
frequency scaling.

Part 1 introduces the components and
concepts you'll need to tune your system for power efficiency, including the
Linux CPUfreq subsystem, C and P states, and the five in-kernel governors.

Part 2 gives more details on the
general settings of the Linux CPUfreq subsystem and the five in-kernel
governors—performance, powersave,
userspace, ondemand, and conservative—and
their settings.

Part 3 compares the performance of the five in-kernel governors in both a tuned
and an untuned state to show you what results you can achieve by power tuning
your system.

Workloads and governor effects

Power efficiency is an important consideration for anyone concerned with business
costs or environmental issues. In this final article in the series, let's look at
the difference in power efficiency (in real numbers and charts) that you get from
tuning the Linux CPUfreq subsystem and in-kernel governors to change the processor's
operating frequency without having a major impact on performance.

In Part 2, you saw how to use and tune the
governors, so now you'll see some governor effects. I use two popular workloads to
compare performance and power consumption and show how a tuned governor can provide
power savings without sacrificing performance:

A workload from the SPECpower_ssj2008 benchmark that evaluates both power and
performance

A workload from an e-commerce shopping application that gathers many statistics
during a simulated online shopping session, including latency times and the
number of requests per second

These comparisons were made on an IBM System x® 3650 running Red Hat Enterprise
Linux 5.2.

The following results are from the SPECpower_ssj2008 benchmark that evaluates both
power and performance. To find out more about this benchmark or see the latest
official benchmark results, see the SPEC Web site (see Related topics for a link). Note that these results are not tuned for
optimal performance and should not be considered official benchmark results for the
system, but rather results obtained for research purposes.

SPECpower_ssj2008 uses a Java™ benchmark to get a performance score in the
unit ssj_ops (ssj operations) and runs the benchmark at loads from 100 percent down
to idle. The higher this score, the more the system can compute.

SPECpower_ssj2008 also measures power in watts and calculates a performance-to-power
ratio at each of the loads. The higher the ratio, the better the system's
performance-to-power efficiency.
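As a minimal sketch of that calculation, the following Python computes the ratio for each load level. The function names and the sample measurements are hypothetical, chosen only for illustration; they are not results from this system. The overall figure assumes that the benchmark's summary metric is the sum of ssj_ops across all load levels divided by the sum of average power across those levels, including idle.

```python
def perf_to_power(ssj_ops, watts):
    """Return the performance-to-power ratio (ssj_ops per watt) for one load level."""
    if watts <= 0:
        raise ValueError("average power must be positive")
    return ssj_ops / watts


def overall_ratio(measurements):
    """Sum of ssj_ops over sum of watts across all load levels (assumed summary metric)."""
    total_ops = sum(ops for ops, _ in measurements)
    total_watts = sum(watts for _, watts in measurements)
    return total_ops / total_watts


# Hypothetical (ssj_ops, average watts) pairs for the 100%, 50%, and idle loads.
sample = [(200000, 250.0), (100000, 200.0), (0, 150.0)]

per_load = [perf_to_power(ops, watts) for ops, watts in sample]
summary = overall_ratio(sample)
```

Because idle contributes power but no operations, a lower idle draw raises the summary ratio even when the scores at each load are unchanged.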

Default governor comparison

Figure 1 compares the effects of the five in-kernel governors, all running with
their default settings. The tunables sched_mc_power_savings and
sched_smt_power_savings were off and the CPU frequency daemon
cpuspeed was running in conjunction with the userspace governor.

Figure 1. Score and power consumption for
default

The dotted lines show the score in ssj_ops, a SPECpower_ssj2008
performance metric. As you can see, the only governor that causes a major drop in
performance is the powersave governor. This is because the powersave
governor statically sets the processor frequency to the lowest available to save as
much power as possible.

The solid lines show the power consumption. Again, the powersave governor uses less
power than the others, but at the expense of performance. You can also see the
difference in idle power between the governors: the performance governor always runs
at the highest frequency and therefore draws around 10 watts more at idle
than the other governors. The userspace governor, with the
cpuspeed daemon, seems to be the best of the default governors at
providing power savings without hurting performance. We can confirm this by
comparing the performance-to-power ratios for each run in Figure 2.

Figure 2. Performance-to-power ratio for
default

The performance-to-power ratio is a metric calculated by SPECpower_ssj2008 to
highlight how power-efficient a system is by comparing the score received to the
amount of power used to achieve that score, so the higher the ratio the better.

As you can see, the userspace governor, in conjunction with the
cpuspeed daemon, has a better performance-to-power ratio than the
others for most of the loads when the governors are running their default
configurations for this setup; therefore, the userspace governor is more power
efficient.

Tuning

As discussed earlier in Part 2, there are some optional tuning
parameters for the ondemand and conservative governors. Here we will discuss how
changing the utilization thresholds can affect the governor's power efficiency.

Remember the scheduler tunables

sched_mc_power_savings controls how processes are scheduled across cores.

sched_smt_power_savings controls how processes are scheduled across the
hyperthreads of a core.

Ondemand

The ondemand up_threshold is set to 80 by
default, meaning that once the CPU utilization rises above 80 percent, the
governor will increase the frequency. Here I'll show how you can tune the governor
to be more power efficient simply by changing the up_threshold to 98.
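One way to apply this change is to write the new value into the governor's sysfs files. The sketch below is a hedged example, not the method used for these runs: the helper name set_governor_tunable is hypothetical, and the sysfs layout is an assumption (per-CPU directories such as cpu0/cpufreq/ondemand/ on many 2.6 kernels, a single cpufreq/ondemand/ directory on some others). It needs root privileges and an active ondemand governor.

```python
import glob
import os


def set_governor_tunable(governor, tunable, value,
                         sysfs_cpu="/sys/devices/system/cpu"):
    """Write value to every matching governor tunable file; return the paths written."""
    # Assumed locations: per-CPU directories first, then a global directory.
    patterns = [
        os.path.join(sysfs_cpu, "cpu[0-9]*", "cpufreq", governor, tunable),
        os.path.join(sysfs_cpu, "cpufreq", governor, tunable),
    ]
    written = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            with open(path, "w") as handle:
                handle.write(f"{value}\n")
            written.append(path)
    return written
```

Called as set_governor_tunable("ondemand", "up_threshold", 98), it returns the list of files it updated; an empty list means no matching tunable was found under the assumed paths.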

Figure 3 compares the ondemand governor's effectiveness running the default
configuration (an up_threshold of 80) versus the tuned ondemand
governor running with an up_threshold of 98. The tunables
sched_mc_power_savings and sched_smt_power_savings
were off during these runs.

Figure 3. Score and power consumption for
ondemand

As the dotted lines show, the default and tuned ondemand governors
achieved very similar scores, so changing the up_threshold does not
show any performance impact. The solid lines, which show power consumption, do show
a slight difference: raising the up_threshold to 98
results in slightly lower power consumption than using the default threshold.

Figure 4. Performance-to-power ratio for
ondemand

Here you can see that for almost every load, the tuned ondemand governor with an
up_threshold of 98 is slightly more power efficient than the
default ondemand governor.

Conservative

The conservative governor has two threshold values
that can be tuned:

First, the up_threshold is set to 80 by default, meaning that once
the processor utilization reaches above 80 percent, the governor increases the
frequency.

Also, there is a down_threshold, which is set to 20 by default.
This means that once the governor finds the processors to be less than 20
percent utilized, it will start stepping down the frequency to save power.

I'll demonstrate how you can tune the conservative governor to be more power
efficient simply by changing the up_threshold to 98 and the
down_threshold to 95. This is fairly aggressive tuning for the
governor, but I'll show that the aggressively tuned conservative governor is more
power efficient.
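To see why thresholds this aggressive can lower power at mid-range loads, consider a deliberately simplified model of one sampling step. It is an assumption-laden sketch, not the kernel's algorithm: the function name, the frequency table, and the one-level-per-step behavior are all hypothetical, and the model ignores the governor's other tunables.

```python
# Hypothetical frequency table in kHz, ordered from lowest to highest.
FREQS_KHZ = [2000000, 2333000, 2667000, 3000000]


def next_frequency(current_khz, utilization, up_threshold=80, down_threshold=20):
    """Toy model of one sampling step of a conservative-style governor."""
    level = FREQS_KHZ.index(current_khz)
    if utilization > up_threshold and level < len(FREQS_KHZ) - 1:
        return FREQS_KHZ[level + 1]  # busy: step up one level
    if utilization < down_threshold and level > 0:
        return FREQS_KHZ[level - 1]  # quiet: step down one level
    return current_khz  # in between: hold the current frequency


# A processor at the top frequency that is 50 percent utilized:
default_choice = next_frequency(3000000, 50)
tuned_choice = next_frequency(3000000, 50, up_threshold=98, down_threshold=95)
```

In this model the default thresholds hold the top frequency at 50 percent utilization, while the tuned thresholds step down one level. On a real system, utilization would typically rise as the frequency drops, so the governor tends to settle at an intermediate frequency rather than fall straight to the minimum.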

Figure 5 compares the conservative governor's effectiveness running the default
configuration, an up_threshold of 80 and a down_threshold
of 20, versus the tuned conservative governor running with an
up_threshold of 98 and a down_threshold of 95. The
tunables sched_mc_power_savings and
sched_smt_power_savings were off during these runs.

Figure 5. Score and power consumption for
conservative

Again, the dotted lines show there is no performance impact when using an
aggressively tuned governor. The solid lines show the difference in power
consumption between the default and tuned governor; it is very clear that the tuned
governor pulls much less power at the middle loads, up to about 40 watts less at the
50 percent load. This is a significant power savings. You can confirm these
observations by comparing the performance-to-power ratios in Figure 6.

Figure 6. Performance-to-power ratio for
conservative

The ratios show that the tuned conservative governor increases its power efficiency
over the default conservative governor for the 30 through 90 percent loads.

Tuned governors comparison

In this section, I'll compare the tuned ondemand and conservative governors to the
other three governors. Figure 7 compares all five governors with ondemand and
conservative threshold tuning. The tunables sched_mc_power_savings and
sched_smt_power_savings were off, and the CPU frequency daemon
cpuspeed was running in conjunction with the userspace governor.

Figure 7. Score and power consumption for tuned
governors

Here you can see again what a big performance hit the powersave governor takes, since
it runs only at the lowest possible frequency, although it does consume much less
power than the others. (We'll look at the powersave governor's power efficiency
compared to the others in the next graph.) The other four governors achieved a
similar score regardless of their tuning. Again, the performance governor runs only
at the highest available frequency, and you can see how great a difference this makes
in power consumption by comparing the solid lines. The userspace governor, running
with the cpuspeed daemon, and the tuned conservative governor pull the
least power after the powersave governor. The userspace governor appears to consume
slightly less power than the tuned conservative governor around the 30 to 50 percent
loads, and the tuned conservative governor pulls less power for the loads above 50
percent. We can see which one has better power efficiency by comparing their
performance-to-power ratios in Figure 8.

Figure 8. Performance-to-power ratio for tuned
governors

From this graph you can see that the power efficiency of the tuned conservative
governor and that of the userspace governor running cpuspeed are very similar.
The final SPECpower_ssj2008 score, not shown here, indicates that the tuned
conservative governor has the best overall power efficiency, but only by a very
small margin.

sched_mc_power_savings comparison

As I discussed earlier, the sched_mc_power_savings tunable attempts to
consolidate processes to as few cores as possible in order to save power. Figures 9
and 10 show a comparison of CPU utilization for a run with
sched_mc_power_savings on (1) and off (0), running with the default
conservative governor. The following comparisons show the utilization for each
processor at the 10 percent load, so the system is on average 10 percent utilized.

Figure 9. sched_mc_power_savings off

Figure 10. sched_mc_power_savings on

You can clearly see the difference in the two graphs. The first graph (Figure 9)
with sched_mc_power_savings off shows that four of the processors are
running at about 15 percent, and the other four are running at about 5 percent
utilization. The second graph (Figure 10) with sched_mc_power_savings
on shows that the load was consolidated onto four processors, now at about 20
percent, and the other four are idle. Using this tunable in conjunction with an
in-kernel CPUfreq governor can reduce power consumption since the consolidation
allows some of the processors to be idle and therefore able to run at a lower
frequency.
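A minimal sketch of toggling the tunable from Python follows. It assumes the setting is exposed as a file named sched_mc_power_savings under /sys/devices/system/cpu, which may not hold for every kernel; the helper name set_sched_tunable is hypothetical, and writing the file needs root privileges.

```python
import os


def set_sched_tunable(name, enabled, sysfs_cpu="/sys/devices/system/cpu"):
    """Turn a scheduler power-savings tunable on (1) or off (0).

    Returns the previous value as a string, or None if the file is absent.
    """
    path = os.path.join(sysfs_cpu, name)  # assumed location of the tunable
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        previous = handle.read().strip()
    with open(path, "w") as handle:
        handle.write("1\n" if enabled else "0\n")
    return previous
```

Returning the previous value makes it easy to restore the original setting after a test run. Under the same assumption about the file location, the helper would also work for sched_smt_power_savings.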

sched_smt_power_savings comparison

Like sched_mc_power_savings, the sched_smt_power_savings
tunable attempts to consolidate load in order to save power, in this case by
scheduling processes onto the hyperthreads of as few cores as possible.
Figures 11 and 12 show a comparison of processor utilization for a
run with sched_smt_power_savings on (1) and off (0), running with the
default conservative governor on a system that supports hyperthreading. The
following comparisons show the utilization for each processor at the 10
percent load, so the system is, on average, 10 percent utilized.

Figure 11. sched_smt_power_savings off

Figure 12. sched_smt_power_savings on

Again you can see that the load was consolidated when the setting was on. If the
CPUs that are idling or close to idling can use CPUfreq governors to lower the
frequency and/or idle C states in conjunction with this type of scheduling, power
savings may be possible.

An e-commerce workload

In this section, I'll compare the governor effects on another type of workload. The
following results are from an e-commerce shopping application that gathers many
statistics during a simulated online shopping session, including latency times and
the number of requests per second. This application uses an Apache front end, a PHP
implementation, and a MySQL database to create a usable shopping site. Note that
these results are not tuned for optimal performance and should not be considered
official results for the system. We'll compare the effects on the workload at
various utilization loads.

Default governors comparison

The following graphs compare the effects of two tunable in-kernel governors,
conservative and ondemand, and the performance governor as a baseline comparison.
All governors are running with their default settings, and the tunables
sched_mc_power_savings and sched_smt_power_savings
were off during these runs.

The Figure 13 series shows the statistics for an online shopping session with 500
clients total. The system under test is 8-12 percent utilized on average when
running 500 clients.

Figure 13a. Performance in requests per
second

Figure 13b. Latency in milliseconds

Figure 13c. Average power in watts

Figure 13d. Performance per watt

Figure 13a shows the performance of the shopping session in requests per second. You
can see that all three governors have almost exactly the same number of requests per
second.

There is a slight difference in average latency, as you can see in Figure 13b. The
conservative governor's latency is almost 7ms higher than that of the performance
governor, but for an application such as an online shopping cart, most users will
not notice a few extra milliseconds, so this difference may be considered
negligible.

Figure 13c shows the average power consumption. You can see that the conservative
governor saves about 20W over the performance governor (that is, over running with no
processor frequency scaling), and the ondemand governor saves about 15W on average.

Figure 13d shows the performance per watt, calculated as the number of requests per
second divided by the average power consumed. The governors' similar performance and
the power savings of the two dynamic governors translate into a higher performance
per watt. The conservative governor is the most power-efficient for a load of 8-12
percent utilization, closely followed by the ondemand governor.

Next we'll compare the performance per watt for each of the three default governors
for some larger loads to see whether the default conservative governor still has
better power efficiency than the other two default governors. Figure 14 shows a load
of 1,000 clients, which results in an average utilization of around 20-25 percent.

Figure 14. Default governor comparison for 1,000
clients

From this chart you can see that the conservative governor is the most
power-efficient governor for this load as well. For this run, the conservative
governor saved about 25W compared with the performance governor while still serving
almost exactly the same number of requests per second. The conservative governor's
average request latency was about 5ms higher than that of the other two governors for
this load.

Last, let's look at the performance per watt for a load of 2,000 clients in Figure
15, which pushes the system under test to around 45-60 percent utilization on
average.

Figure 15. Default governor comparison for 2,000
clients

For this load, the default ondemand governor had a slightly better performance per
watt. The ondemand and conservative governors both saved about 15W here, but the
default conservative governor took a performance hit: it completed about 8
fewer requests per second than the others, and its latency was about 150ms (0.15
seconds) higher than that of the performance governor. The ondemand governor won here
since it achieved virtually the same number of requests per second with a latency only
50ms higher than that of the performance governor, which of course represents what the
system would achieve without any processor frequency scaling at all.

Tuned governors comparison

Now we'll compare how the tuned ondemand and conservative governors behave with this
workload. Again, the tuning of the governors was achieved by changing the
utilization thresholds. The tuned conservative governor had its
up_threshold set to 98 and the down_threshold set to
95. The tuned ondemand governor was running with an up_threshold of 98
as well. We'll look at the effects of the tuned governor on a heavier load of 2,000
clients (45-60 percent utilization on average) in the Figure 16 series.

Lighter loads do not show much of a difference, because the tuned governors act the
same as the default governors for all loads under 20 percent utilization since that
is the default down_threshold. The tunables
sched_mc_power_savings and sched_smt_power_savings
were off for these runs.

Figure 16a. Performance in requests per
second

Figure 16b. Latency in milliseconds

Figure 16c. Average power in watts

Figure 16d. Performance per watt

Figures 16a and 16b show that the tuned conservative governor took a slight
performance hit: about 13 fewer requests per second and a latency about 280ms (0.28
seconds) higher than that of the performance governor. However, you can see from Figure
16c that the tuned conservative governor achieved a significant power savings of
about 55W over no processor scaling. Even with the slight performance hit, the tuned
conservative governor was the most power-efficient by far.
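Whether a trade like this improves performance per watt comes down to comparing the relative drop in throughput with the relative drop in power. The sketch below works through that arithmetic. The deltas (13 requests per second and 55W) come from the comparison above, but the baseline values are hypothetical placeholders, since the article does not list them, and the function name is illustrative.

```python
def perf_per_watt_gain(base_rps, base_watts, rps_drop, watts_saved):
    """Return the relative change in requests per second per watt."""
    base = base_rps / base_watts
    tuned = (base_rps - rps_drop) / (base_watts - watts_saved)
    return tuned / base - 1.0


# Hypothetical baseline for the performance governor: 400 requests/s at 300W.
gain = perf_per_watt_gain(400.0, 300.0, rps_drop=13.0, watts_saved=55.0)

# Efficiency improves whenever the fractional throughput loss is smaller
# than the fractional power saving.
improves = (13.0 / 400.0) < (55.0 / 300.0)
```

With these placeholder baselines the performance per watt improves by roughly 18 percent; the real figure depends on the measured baseline throughput and power.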

Conclusion

In this three-part series, I've shown that in most cases, a tuned conservative governor
with an up_threshold of 98 and a down_threshold of 95
achieves the best performance-to-power efficiency. In some cases, this governor can
have a slight effect on performance.

You must decide whether the possible effect on performance is worth achieving
potentially significant power savings. As I discussed, there are many tunables for
the dynamic in-kernel governors that you can adjust to affect the performance of the
governor, which in turn can affect the performance of the workload running.

As always, there is a tradeoff between power savings and performance, but I hope
I've shown you how to reduce the effects on performance to a negligible degree while
getting a better power efficiency from the system.

SPECpower_ssj2008 is the first industry-standard SPEC benchmark that
evaluates the power and performance characteristics of volume server class
computers. The initial benchmark addresses the performance of server-side Java,
and additional workloads are planned. You can see the latest results.