>> On Fri, 29 Oct 2004, linux-os wrote:
>>
>> Linus, there is no way in hell that you are going to move
>> a value from memory into a register (pop ecx) faster than
>> you are going to do anything to the stack-pointer or
>> any other register.
>
> Sorry, but you're wrong.

I am not wrong.

I don't understand anything about your theoretical CPU
with the magic stack engine. Anything I can get my
hands on functions exactly as I described and exactly
as would be expected. We work with real hardware here
and I have to test it as part of my job.

And, FYI, I spend all my working time trying to get the
last iota of performance out of ix86 CPUs. Since I can
only read publicly available documentation, I have
to test code in actual operation.

The attached file shows that the Intel Pentium 4 runs
exactly as I described. Further, there is no difference in
the CPU clocks used when adding a constant to the
stack-pointer or using LEA.

It also shows that popping stack-data into the same register
twice, as you suggested, takes the same time as using a
different register.

The code uses a separate assembly-language file so that
the 'C' compiler can't optimize away what I am measuring.
It also saves and uses the lowest number of CPU cycles
seen, so the code doesn't have to execute with the interrupts
OFF to get a stable reading.
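
The attachment itself is not reproduced here. As a rough sketch of the
"keep the lowest reading" idea only, and not the attached code, something
like the following could be used. It assumes a POSIX clock_gettime()
in place of a CPU cycle counter, and the names variant_a, variant_b,
now_ns and min_elapsed_ns are hypothetical. The two variants are trivial
stand-ins for the routines that would really live in the separate
assembly-language file.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins for the routines under test. In a real test
 * these would be in a separate assembly file so the C compiler cannot
 * touch them. */
static volatile uint32_t sink;

static void variant_a(void) { sink += 4; }
static void variant_b(void) { sink = sink + 4; }

/* Monotonic timestamp in nanoseconds (assumption: a portable clock is
 * used here instead of a CPU cycle counter). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Time fn() `trials` times and keep only the smallest reading.
 * A trial disturbed by an interrupt gives a larger value and is
 * discarded, so interrupts do not need to be turned off.
 * Returns UINT64_MAX if trials <= 0. */
static uint64_t min_elapsed_ns(void (*fn)(void), int trials)
{
    assert(fn != NULL);

    /* Call through a volatile pointer so the call is not inlined away. */
    void (*volatile call)(void) = fn;
    uint64_t best = UINT64_MAX;

    for (int i = 0; i < trials; i++) {
        uint64_t start = now_ns();
        call();
        uint64_t elapsed = now_ns() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

Comparing min_elapsed_ns(variant_a, n) against min_elapsed_ns(variant_b, n)
for some large n then gives one number per variant. The clock overhead is
included in both readings, so only the difference between them means
anything.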

>
> Learn about modern CPU's some day, and realize that cached accesses are
> fast, and pipeline stalls are relatively much more expensive.
>

That's what I do, and that's what I teach.

> Now, if it was uncached, you'd have a point.
>
> Also think about why
>
>	call xxx
>	jmp yy
>
> is often much faster than
>
>	push $yy
>	jmp xxx
>
> and other small interesting facts about how CPU's actually work these
> days.
>
>		Linus
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.9 on an i686 machine (5537.79 BogoMips).
 Notice : All mail here is now cached for review by John Ashcroft.
                 98.36% of all statistics are fiction.
[unhandled content-type:application/x-gzip]