About this blog

AIXpert Blog is about the AIX operating system from IBM running on POWER-based machines called Power Systems, and related software such as IBM Systems Director, PowerVM for virtualisation and PowerSC for security, plus performance monitoring and nmon.

Local, Near & Far Memory part 4 - Aggressive Intelligent Threads

This is a follow-on from yesterday's blog, prompted by Chris Gibson highlighting a question/concern from one of his customers in Australia. They were comparing POWER6 and POWER7, and the utilisation graphs for the SMT logical processors look different. I looked at some nmon data (what else!) and yes, they do look different. I then ran a simple generated workload test and reproduced the graphs. Below I explain them - once again, these are my personal observations rather than an official AIX developers' insider statement.

I ran a workload: ncpu -p8 -z 25 -h1 -s 900

This runs 8 processes, each sleeping 25% of the time but pausing for 1 second after each 1 second of CPU time, and stops after 900 seconds.

This gives us a bunch of programs starting and stopping fairly randomly. The 900 second limit is a sanity check so that this artificial workload will stop if I forget to kill the program!

The POWER7 virtual machine has twice as many logical CPUs, as expected, because it is running SMT=4 instead of SMT=2.

The POWER6 graphs show a fairly even split of work between logical CPUs CPU001 and CPU002 (these two combined make up the first POWER6 physical CPU-core). This is because it is in SMT=2 mode and there is no real favourite between the SMT threads: one thread is as good as the other.

The POWER7 graphs show that for logical CPUs CPU001, CPU002, CPU003 and CPU004 (these four combined make up the first POWER7 physical CPU-core), the first logical CPU is very much favoured over the second, and the third and fourth logical CPUs are hardly used at all.

The POWER7 behaviour is Intelligent SMT Threading mode switching in action. It knows there are not enough processes running (low run queue) to use SMT=4, so it has switched to SMT=2 and moved the processes to the first two logical CPUs. Then it notices that, for a fair chunk of the time, there are not even enough processes running to need SMT=2, so it switches to SMT=1 and moves the processes to the first logical CPU. This means the single running process gets the internal resources of the whole physical CPU-core with no contention from other threads, and so gets a speed boost.
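To make the idea concrete, here is a minimal toy model of the behaviour described above. It is only a sketch of my explanation, not how AIX or the hypervisor actually decides: the function names `pick_smt_mode` and `busy_logical_cpus` are made up for illustration, and the rule "use the smallest SMT mode that covers the runnable threads" is an assumption.

```python
def pick_smt_mode(runnable_threads, max_smt=4):
    """Toy rule (assumption): smallest SMT mode of 1, 2 or 4 that covers the runnable threads."""
    for mode in (1, 2, 4):
        if mode > max_smt:
            break
        if runnable_threads <= mode:
            return mode
    # More runnable threads than hardware threads: stay at the widest mode.
    return max_smt


def busy_logical_cpus(runnable_threads, max_smt=4):
    """Logical CPUs of the first core that would carry work under the toy rule."""
    mode = pick_smt_mode(runnable_threads, max_smt)
    in_use = min(runnable_threads, mode)
    return ["CPU%03d" % n for n in range(1, in_use + 1)]


if __name__ == "__main__":
    # Hypothetical run-queue depths for a single POWER7 core.
    for runnable in (1, 2, 3, 4):
        print(runnable, "runnable -> SMT=%d" % pick_smt_mode(runnable),
              busy_logical_cpus(runnable))
```

With only one or two runnable threads, the toy rule keeps the work on the first one or two logical CPUs, which is the shape seen in the POWER7 graphs. Passing `max_smt=2` mimics a core limited to SMT=2, as on POWER6, although it says nothing about how evenly POWER6 spreads work between its two threads.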

Both POWER6 and POWER7 were using roughly 2.5 physical CPUs, but it is clear with POWER7 that we could remove a physical CPU-core, or even two, as you can clearly see there are plenty of unused SMT logical CPUs to run work on.

Once more for the record, for shared CPUs: it is impossible to average the logical CPU utilisation stats to work out how busy your physical CPU-cores are, because the logical CPUs are concurrently executing on the shared internal compute units of the physical CPU-core. You can't find the 2.5 in the graphs above.
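A small illustration of why the averaging fails, using made-up numbers. The timelines below are hypothetical (1 = busy time slot, 0 = idle) and the "occupancy" measure is only a stand-in I have chosen for illustration; it does not model how AIX actually accounts for physical CPU use. The point is just that the same per-logical-CPU percentages can come from very different situations on the core.

```python
def logical_utilisation(timeline):
    """Percentage of time slots in which this logical CPU was busy."""
    return 100.0 * sum(timeline) / len(timeline)


def core_occupancy(timelines):
    """Percentage of time slots in which at least one logical CPU of the core was busy."""
    slots = list(zip(*timelines))
    return 100.0 * sum(1 for slot in slots if any(slot)) / len(slots)


# Hypothetical one-core, SMT=2 timelines.
overlapping = [[1, 1, 0, 0], [1, 1, 0, 0]]  # both threads busy at the same time
alternating = [[1, 1, 0, 0], [0, 0, 1, 1]]  # threads take turns

if __name__ == "__main__":
    for name, timelines in (("overlapping", overlapping), ("alternating", alternating)):
        per_cpu = [logical_utilisation(t) for t in timelines]
        average = sum(per_cpu) / len(per_cpu)
        print(name, per_cpu, "average:", average, "core occupied:", core_occupancy(timelines))
```

Both cases report 50% on each logical CPU, so the average is identical, yet in one case the core has something running half the time and in the other it is never idle.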

The customer question was: Is there something wrong with POWER7?

The answer is: No. Actually, there is something very right with POWER7!
