Hi, David
Today I did a quick test against your patch set on my MIPS32 Malta board.
After fixing a small compiling issue (see my comment for patch #5), I
successfully built the kernel based on my previous mainline-sync changes.
For the test I used a previously compiled 'perf' tool, because the
latest perf tool needs arch-specific DWARF register mapping definitions
(and we have not yet submitted that patch).
And here's the test result:
# When this patch set is built in, a simple 'perf stat' command takes a
very long time (182 seconds for the ls command), as follows:
-sh-4.0# perf stat -e cycles -e instructions ls /
bin dev home lost+found opt root share tmp usr
boot etc lib mnt proc sbin sys trans var
Performance counter stats for 'ls /':
2825998290 cycles
2148970283 instructions # 0.760 IPC
181.901999444 seconds time elapsed
# When this patch set is NOT used, namely, only the mainline-sync changes
are built in, the time looks reasonable:
-sh-4.0# perf stat -e cycles -e instructions ls /
bin dev home lost+found opt root share tmp usr
boot etc lib mnt proc sbin sys trans var
Performance counter stats for 'ls /':
2051461 cycles
1041512 instructions # 0.508 IPC
0.046426513 seconds time elapsed
I noticed that you changed quite a lot of the original logic in MIPS
Perf-events, including the deletion of the 'msbs' member in the struct
cpu_hw_events. Honestly speaking, I have not yet taken a careful look into
the patch set to find out how you deal with the MIPS-specific counter
overflow at 0x80000000 (that value is for MIPS32) rather than at
0xffffffff. But this logic may be related to the test result.

Can you test the new version of my patches I just posted? I think I may
have fixed this issue, but I cannot actually test 32-bit counters.

I think I was initializing the counters to the wrong value in the 32-bit
case. This would have caused an almost unending stream of counter
overflow interrupts, thus slowing things down.
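
To illustrate the kind of mistake described above, here is a minimal
sketch in C. It is not the code from the patch set. It assumes, as
mentioned earlier in this thread, that a MIPS32 counter signals overflow
once bit 31 (0x80000000) is set, and the helper names below are
hypothetical. It shows why a start value computed as if the overflow
point were the 32-bit wrap would already have bit 31 set, so the
overflow condition would be true immediately.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption from this thread: on MIPS32 the overflow condition is bit 31. */
#define MIPS32_OVERFLOW_BIT 0x80000000u

/* Hypothetical helper: true when the counter value signals overflow. */
static bool counter_overflowed(uint32_t value)
{
    return (value & MIPS32_OVERFLOW_BIT) != 0;
}

/*
 * Hypothetical helper: start value that reaches the overflow bit after
 * 'period' events. Periods larger than 0x80000000 are clamped.
 */
static uint32_t counter_start_value(uint32_t period)
{
    if (period > MIPS32_OVERFLOW_BIT)
        period = MIPS32_OVERFLOW_BIT;
    return MIPS32_OVERFLOW_BIT - period;
}

/* Number of events that can be counted before the overflow bit is set. */
static uint32_t events_until_overflow(uint32_t start)
{
    return counter_overflowed(start) ? 0u : MIPS32_OVERFLOW_BIT - start;
}
```

With these helpers, counter_start_value(1000) leaves 1000 events before
the overflow bit is set, whereas a start value of (uint32_t)-1000, which
would only be right if the overflow happened at the 32-bit wrap, already
has bit 31 set. If the counter is re-armed with such a value on every
interrupt, the result would be the nearly continuous interrupt stream
described above.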