> 2) The profiling timer runs completely in assembler and does
> not save/restore state, just records the profiling info
> then returns from the trap, no c-code, smaller cache
> footprint and less cycles burned, thus the changed timing
> of the kernel caused by the profiling tick itself
> is reduced as much as possible

some hardware sucks ... for example, on my pentium system, just getting to the first instruction of the IRQ handler costs ... 8 microseconds :(( I guess it's due to the legacy PIC chip still sitting on the ISA bus ...

really, 8 microseconds, from the point where CPU execution stops, to the point where the interrupt vector shows up. It's 800 wasted cycles. PC hardware sucks.
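
(the arithmetic behind the 800, as a rough sketch: 8 microseconds only comes out as 800 cycles on a 100 MHz clock, so the 100 MHz below is inferred from the two numbers above, not measured. latency_to_cycles() is a made-up helper name, not anything in the kernel.)

```c
#include <stdint.h>

/*
 * Convert an interrupt entry latency into CPU cycles.
 * Hypothetical helper for illustration only.
 *
 * clock_mhz is cycles per microsecond, so:
 *   cycles = latency_ns * clock_mhz / 1000
 */
static uint64_t latency_to_cycles(uint64_t latency_ns, uint64_t clock_mhz)
{
    return latency_ns * clock_mhz / 1000;
}
```

latency_to_cycles(8000, 100) gives 800, matching the figure above; the same 8 microseconds on a faster clock would waste proportionally more cycles.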

i will measure how expensive the SMP IPI interrupts are, from the hardware point of view. Maybe it makes sense to bombard one CPU with cross-CPU interrupts, generating profiling irqs. They should be much cheaper, theoretically, and if you control the bombardment, they can be rather random and irregular compared to the timer IRQ on the first CPU.
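
(one way the "controlled bombardment" could pick its send times, purely as a sketch: jitter the delay between profiling IPIs with a cheap pseudo-random generator so the samples do not stay in lockstep with the periodic timer tick. Everything here is an assumption for illustration: the names xorshift32() and next_sample_delay_us(), the microsecond units, and the choice of generator. It assumes jitter_us <= base_us and a nonzero seed, and it does not send any IPI itself.)

```c
#include <stdint.h>

/* Marsaglia xorshift32 step; the state must be nonzero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Hypothetical: delay until the next profiling IPI, in microseconds,
 * picked from [base_us - jitter_us, base_us + jitter_us].
 * Caller must ensure jitter_us <= base_us.
 */
static uint32_t next_sample_delay_us(uint32_t *state,
                                     uint32_t base_us,
                                     uint32_t jitter_us)
{
    uint32_t span = 2 * jitter_us + 1;   /* number of possible delays */

    return base_us - jitter_us + xorshift32(state) % span;
}
```

with base_us = 1000 and jitter_us = 200, every delay falls between 800 and 1200 microseconds, so the sampling points drift relative to a fixed-period tick instead of always landing on the same code.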

curious what the typical hardware irq latency numbers look like on the Sparc :)