Today I tried some benchmarking with my animation algo and came across something very weird.

I first render 1 character, then increase the count to about 700 characters, then go back to 1 character. But see what's happening: suddenly the performance drops from 110 MVerts/s to about half, 65 MVerts/s. I didn't change anything in my program at this point...

I found it's glFinish() where all the time is consumed, and it's caused by the asynchronous glReadPixels() that copies from the FBO into a PBO.

Has anybody experienced something similar?
Is it a well-known and already fixed driver bug?

I found it only happens if a large number of triangles is pushed
(like a scene with more than 3M vertices).

The animation is done via PBO (render to an FBO, copy the result into a VBO).
Hardware: NVIDIA 6600GT, WinXP, 81.98 drivers.
On the newest drivers I can't get glReadPixels to do
its job at all, so I can't say whether it's fixed there.
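For reference, the readback path described above can be sketched roughly like this. This is a minimal sketch, not the actual code from the project: it assumes GLEW for extension loading and a current GL context, the function and buffer names are illustrative, and error checking is omitted.

```c
/* Sketch of the render-to-FBO -> async glReadPixels -> PBO path.
 * Assumed names: fbo, pbo, w, h are all hypothetical.
 * Needs a current GL context with FBO/PBO support (GL 2.1-era). */
#include <GL/glew.h>

void readback_frame(GLuint fbo, GLuint pbo, int w, int h)
{
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    /* ... render the animated character into the FBO here ... */

    /* With a PIXEL_PACK buffer bound, glReadPixels returns immediately;
       the FBO->PBO copy happens asynchronously on the GPU. */
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glReadPixels(0, 0, w, h, GL_RGBA, GL_FLOAT, (void *)0); /* offset 0 into the PBO */
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

    /* glFinish blocks until ALL pending GL work is done, including the
       asynchronous readback above - which is where the stall shows up. */
    glFinish();
}
```

The PBO can afterwards be bound as GL_ARRAY_BUFFER and used as the VBO for the animated geometry, which is the "copy to VBO" step mentioned above.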

Cams18

11-12-2006, 08:28 PM

>> I found it only happens if a large number of triangles is pushed (like a scene with more than 3M vertices)

That may explain it. Do you really need to push them all at once? Maybe you need to divide your task into two or three pieces to improve the performance.

Sv3nni

11-13-2006, 03:55 AM

Actually, the 3M vertices were just a test; that many aren't required for my project. But I was still surprised this could happen - I thought glFinish() waits until every background task is finished, doesn't it?
The triangles are not pushed all at once. There are 43 different vertex buffers (43 chars) which are rendered repeatedly to reach 688 characters. glReadPixels() is called once for each character.
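One way to keep those per-character readbacks asynchronous is to alternate between two PBOs, so the glReadPixels for one character overlaps with consuming the result of the previous one. A hypothetical sketch - the buffer names, loop structure, and vertex format are assumptions, not taken from the thread:

```c
/* Double-buffered PBO readback: kick off the copy for character i,
 * then draw using the (hopefully finished) data of character i-1.
 * Assumed names: pbo[2], count, w, h are illustrative. */
#include <GL/glew.h>

void render_characters(GLuint pbo[2], int count, int w, int h)
{
    glEnableClientState(GL_VERTEX_ARRAY);
    for (int i = 0; i < count; ++i) {
        /* ... render character i (one of the 43 shared vertex buffers)
               into the FBO here ... */

        /* Start the async readback for this character. */
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[i & 1]);
        glReadPixels(0, 0, w, h, GL_RGBA, GL_FLOAT, (void *)0);
        glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

        if (i > 0) {
            /* Reuse the OTHER PBO as vertex source for the previous
               character; its readback has had a whole character's worth
               of work to complete, so this is less likely to stall. */
            glBindBuffer(GL_ARRAY_BUFFER, pbo[(i - 1) & 1]);
            glVertexPointer(4, GL_FLOAT, 0, (void *)0);
            /* ... glDrawArrays / glDrawElements with the animated data ... */
        }
    }
    glDisableClientState(GL_VERTEX_ARRAY);
}
```

Whether this helps here depends on when the driver actually schedules the copy, of course - if it defers everything until the glFinish, the stall just moves.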

An unconventional solution would be to buy an NV8800. Then I could switch to texture buffer objects and wouldn't need the readpixels anymore :-)