Description of problem:
The following error is reported on an ia64 box:
------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec]
Conflicting CPU frequency values detected: 1466.000000 != 1667.000000
65536 1000 0.00 0.00
------------------------------------------------------------------
Note that it only happens when ia64 is on the client side. When it's used as
server, things work fine.
Version-Release number of selected component (if applicable):
RHEL5.2-Server-20080212.0 tree
How reproducible:
Every time.
Steps to Reproduce:
1. Run ib_send_bw utility where the client is an ia64 node
Actual results:
Expected results:
Additional info:
This is a regression

OK, this isn't an openib issue, it's a kernel issue. Basically, the output of
/proc/cpuinfo on the 16-way Itanium this bug comes from is not, in any way,
constant. The nominal cpu MHz is 1466, but by running this command:
while true; do grep "cpu MHz" /proc/cpuinfo | grep 1667; done
you do in fact get an occasional jump up from 1466 to 1667. What's more, the
bogomips in this file can be totally horked. These are the 16 bogomips values
from a single cat of /proc/cpuinfo:
BogoMIPS : 1658.88
BogoMIPS : 1671.16
BogoMIPS : 16.35
BogoMIPS : 16.35
BogoMIPS : 3301.37
BogoMIPS : 3309.56
BogoMIPS : 3309.56
BogoMIPS : 3309.56
BogoMIPS : 1671.16
BogoMIPS : 1683.45
BogoMIPS : 1662.97
BogoMIPS : 1667.07
BogoMIPS : 3309.56
BogoMIPS : 3309.56
BogoMIPS : 3309.56
BogoMIPS : 3301.37
In any case, it really looks like the contents of /proc/cpuinfo on ia64 aren't
reliable/trustable, and the perftest program is noticing that and refusing to
compute bandwidth numbers from an inconsistent divisor (the numbers don't have
much value if you don't have a reference between wall time and cpu time).
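
To illustrate why the divisor matters, here is a minimal sketch of turning a cycle count into a bandwidth figure. It is not the perftest code: the function name, its parameters, and the choice of 1 MB = 2^20 bytes are assumptions made for this example.

```c
#include <assert.h>

/* Hypothetical helper, not taken from perftest: convert a measured cycle
   count into MB/sec, given the CPU frequency used as the time reference. */
double bandwidth_mb_per_sec(double bytes_per_iter, double iterations,
                            double elapsed_cycles, double cpu_mhz)
{
    double seconds;

    assert(elapsed_cycles > 0.0 && cpu_mhz > 0.0);
    /* cpu_mhz * 1e6 is cycles per second, so this is elapsed wall time. */
    seconds = elapsed_cycles / (cpu_mhz * 1e6);
    /* Assumption: 1 MB is 2^20 bytes. */
    return bytes_per_iter * iterations / seconds / (1024.0 * 1024.0);
}
```

With the same cycle count, plugging in 1667 instead of 1466 for cpu_mhz scales the result by 1667/1466, roughly 14%, which is why an inconsistent frequency makes the number meaningless.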

Changing severity to low. This Intel whitebox system appears to be the only
system with this issue. In the past, Intel whiteboxes have shown other strange
behavior. I suspect that an SCI is being issued during system init and that is
causing problems with the bogomips calibration.
P.

(In reply to comment #6)
> Gurhan -- is this system connected to a serial console?
>
> P.
Prarit,
No, not yet:(
Right now the box is being used by mjenner, you can grab me or him to show you
in the lab where the box is if you are in the office.

Ok, so trying this on another ia64 box, I can get the client program working,
however it also prints out this warning message:
Warning: measured timestamp frequency 399.187 differs from nominal 1594 MHz
Prarit, I'll let you decide what to do about it since I don't know what could be
causing it or if it's a bug.
This was, by the way, on the hp-sapphire-02.rhts.boston.redhat.com box, borrowed
from dchapman.

Why does the tool use the bogomips value in /proc/cpuinfo as the definitive
information to decide whether or not to run? Instead of using the boot-time
data, the tool should calibrate the value itself to reflect the current state.

The tool doesn't use the bogomips value. I only posted the bogomips values to
demonstrate how screwed up the values in /proc/cpuinfo are on the machine in
question. The tool uses the CPU MHz value only, and that value varies on this
particular machine.

It is also an iffy decision to use the CPU MHz value in your test program,
because with DBS and cpuspeed enabled, the CPU MHz value in /proc/cpuinfo is
meant to track the current CPU P-state, which is adjusted from time to time
based on the current workload.

I think it is not proper to use the CPU MHz value from /proc/cpuinfo in the
ib_send_bw test program, given how changeable it is... The tool should calibrate
the CPU MHz value by itself.
After bootup, if calibration still yields bad bogomips values like those in
comment #1, then that is a real problem, and we need to worry about it.

In response to comment #13, the program *is* generating its own CPU MHz rating.
That's the whole point of the message in comment #8. Contrary to Prarit's
comment #12, the program isn't comparing an itc clock to a cpu clock, it's
comparing the CPU MHz as generated using this method:
/*
Use linear regression to calculate cycles per microsecond.
http://en.wikipedia.org/wiki/Linear_regression#Parameter_estimation
*/
versus the value reported in /proc/cpuinfo and is then reporting the variance
whenever the variance is > 1%.
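
For reference, a minimal sketch of that kind of estimate follows. It is not the code from get_clock.c: the function names and the sample arrays are hypothetical, and treating the 1% threshold as relative to the nominal value is an assumption. The slope of a least-squares fit of cycle counts against elapsed microseconds is the estimated cycles per microsecond, i.e. MHz.

```c
#include <assert.h>
#include <stddef.h>

/* Least-squares slope of cycles[] against usec[].
   Hypothetical helper; the caller supplies paired samples of elapsed
   microseconds and elapsed cycle counts. The usec[] samples must not all
   be equal, or the denominator below is zero. */
double cycles_per_usec(const double *usec, const double *cycles, size_t n)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    size_t i;

    assert(n >= 2);  /* a slope needs at least two samples */
    for (i = 0; i < n; i++) {
        sx  += usec[i];
        sy  += cycles[i];
        sxx += usec[i] * usec[i];
        sxy += usec[i] * cycles[i];
    }
    return (n * sxy - sx * sy) / (n * sxx - sx * sx);
}

/* Nonzero when measured differs from nominal by more than 1% of nominal
   (assumption: the threshold is taken relative to the nominal value). */
int differs_by_more_than_1pct(double measured, double nominal)
{
    double diff = measured > nominal ? measured - nominal : nominal - measured;
    return diff > 0.01 * nominal;
}
```

With the numbers from comment #8, a measured 399.187 against a nominal 1594 is far outside 1%, so the warning fires.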
Basically, the program has two checks it performs on CPU MHz. The first is that
it reads all of the CPU MHz values from /proc/cpuinfo. It makes sure that all
CPUs report the same speed. If, in a single reading of /proc/cpuinfo, some CPUs
have one speed and others have another, then it reports the message that
originally caused this bug to be opened. Once the reading of /proc/cpuinfo has
passed the "all cpus are identical speed" test, then the code generates its own
value of CPU MHz based upon the linear regression technique and compares that to
the value reported in the /proc/cpuinfo file. If the difference between the two
is greater than 1%, you get the second message.
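
The first check can be sketched roughly as below. This is an illustration rather than the perftest source: the function name is made up, it works on text already read into memory instead of opening /proc/cpuinfo itself, and the exact floating-point comparison is an assumption based on the wording of the error message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan cpuinfo-style text for "cpu MHz" lines (hypothetical helper).
   Returns 0 and stores the shared value in *mhz when every such line
   agrees; returns -1 when two lines disagree or no such line exists. */
int common_cpu_mhz(const char *text, double *mhz)
{
    const char *line = text;
    double first = 0.0;
    int found = 0;

    assert(text != NULL && mhz != NULL);
    while (*line) {
        double value;
        const char *next;

        /* Whitespace in the format matches any run of blanks or tabs. */
        if (sscanf(line, "cpu MHz : %lf", &value) == 1) {
            if (found && value != first)
                return -1;  /* conflicting CPU frequency values */
            first = value;
            found = 1;
        }
        next = strchr(line, '\n');
        if (next == NULL)
            break;
        line = next + 1;
    }
    if (!found)
        return -1;
    *mhz = first;
    return 0;
}
```

A single reading that contains both 1466.000000 and 1667.000000 makes this return -1, which corresponds to the condition behind the original error message.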
Now, I should point out that these same tests have never produced any problems
anywhere other than ia64, so I'm pretty sure the linear regression method is
working properly (at least on i686/x86_64 and probably ppc64 too). That would
seem to indicate that, contrary to Luming's comment #11, the values in
/proc/cpuinfo are *not* in fact being kept consistent with the current processor
state.
All that makes me think that this is still a kernel problem, not a problem with
the ib test code.

OK, it looks like Prarit's comment #12 was correct. My mistake on that. While
looking through the header file get_clock.h from the source code, it appears
that the method by which the code gets the cycle count on ia64 is to access
ar.itc, which I can only assume is the itc clock that Prarit referred to.
I'm attaching get_clock.c and get_clock.h to this report. These contain the
functions the perftest programs use to calibrate/check the cpu cycle times.
Now, if the itc clock isn't always the same as the cpu clock, is it true that
they are always a set multiple of each other, and if so what are the possible
multiples? I could write the code so that on ia64 it checks alternative
multiples before declaring the values bad.

Created attachment 301635
possibly fixed get_clock.c
This version of get_clock.c attempts to determine if a multiple is in use
between the itc and cpu clocks, and if so adjusts things accordingly.
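
The general idea, sketched under assumptions (this is not the attached file): try small integer multiples of the measured itc frequency and accept the first one that lands within 1% of the nominal CPU MHz. The function name, the integer-only multiples, and the caller-supplied upper bound are all hypothetical; the thread does not establish which ratios the hardware can actually use.

```c
#include <assert.h>

/* Return the smallest integer k in [1, max_multiple] for which
   k * measured_mhz is within 1% of nominal_mhz, or 0 if none fits.
   Hypothetical helper; integer ratios only are an assumption. */
int find_clock_multiple(double measured_mhz, double nominal_mhz,
                        int max_multiple)
{
    int k;

    assert(measured_mhz > 0.0 && nominal_mhz > 0.0);
    for (k = 1; k <= max_multiple; k++) {
        double scaled = k * measured_mhz;
        double diff = scaled > nominal_mhz ? scaled - nominal_mhz
                                           : nominal_mhz - scaled;
        if (diff <= 0.01 * nominal_mhz)
            return k;
    }
    return 0;
}
```

For the values in comment #8, 4 * 399.187 = 1596.748, which is within 1% of 1594, so a multiple of 4 would be accepted.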

OK, I've built a new version of perftest with the modified routine to check for
a clock multiple of the itc. It will still fail if it detects different cpu
speeds, and there's not much I think we should do about that. I would be more
inclined to tell people to disable cpu speed scaling during performance runs.

Thanks Gurhan. Added the following note to the RHEL5.2 release notes updates:
<quote>
(ia64) Running perftest will fail if different CPU speeds are detected. As such,
you should disable CPU speed scaling before running perftest
</quote>
please advise if any further revisions are required. thanks!

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2008-0432.html

Release note added. If any revisions are required, please set the
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.
