> On Sunday 30 November 2008 04:52:41 Avi Kivity wrote:> > Alexander van Heukelum wrote:> > > I now did the benchmarks for the same -rc6 with hpa's 4-byte stubs> > > too. Same machine. It's significantly better than the other two> > > options in terms of speed. It takes about 7% less cpu to handle> > > the interrupts. (0.64% cpu instead of 0.69%.) I have to run now,> > > I'll let interpreting the histogram to someone else ;).> >> > This is noise. 0.05% cpu on a 1GHz machine servicing 1000 interrupt/sec> > boils down to 500 cycles/interrupt. These changes shouldn't amount to> > so much (and I doubt you have 1000 interrupts/sec with a single disk)..> > Sure, but smallest cache wins. Which is why I thought hpa chose the 3 byte > option.> > > I'm sorry, but the whole effort is misguided, in my opinion.> > Respectfully disagree. I wouldn't do it, but it warms my heart that > others are. It's are not subtractive from other optimization efforts.

Yeah, the efforts from Alexander, Peter and Cyrill are fantastic.

The most positive effects of it isnt just the optimizations and code compression. What is important is the fact that this piece of code (entry_*.S) has not gotten systematic attention for a decade, and has become a rather crufty patchwork of hit-and-run changes. It has become very hard to review and it has become a reoccuring source of nasty bugs.

It is now being reworked and re-measured. It does not matter how small and hard to measure the gain is in one specific case - we are mainly concerned about not creating measurable _harm_, and we are after the cleanups and the increase in maintainability. Once the code is cleaner, improvements are much easier to do and bugs are much easier to fix.