
Recently I've been revamping the ol' engine (basically a rewrite) and have come upon my vertex cache management system. Currently the system is very suboptimal: it creates far too many small VBOs.
I've been searching through the forums (and the nVidia/ATI PDFs) for any insight into an optimal system. Right now I'm thinking of this:

Maintaining one list of VBOs per usage type (dynamic, static, stream), in ascending order of slot size.

Each VBO would be subdivided into slots of fixed length; each slot has a unique ID.

Each slot maintains its offset into the VBO, along with a pointer to the GeometryChunk which is currently using it.

When searching for an open slot, traverse the appropriate list and find a free slot of a close enough size, store a pointer to its slot info structure in a map with its UID as the key, and mark the slot as used (obviously).
I think a system such as this could work well. I am, however, uncertain about several elements. For example, what is the optimal size for a VBO? This post is sorta vague, but I'm just curious as to everyone's thoughts on an efficient vertex cache system.
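
As a rough illustration, the bookkeeping described above could be sketched as follows. This is only a sketch under stated assumptions: `SlotInfo`, `SlotCache`, `addBuffer`, `acquire` and `find` are hypothetical names, no GL calls are made, and one `SlotCache` instance would be kept per usage type. A `std::multimap` keyed by slot size stands in for the "list in ascending order of slot size".

```cpp
#include <cstdint>
#include <map>

struct GeometryChunk;  // opaque here; the owner type mentioned in the post

// Hypothetical slot record: a few 32-bit fields plus the owner pointer.
struct SlotInfo {
    std::uint32_t uid;      // unique slot ID
    std::uint32_t buffer;   // index of the owning VBO
    std::uint32_t offset;   // byte offset into that VBO
    std::uint32_t size;     // fixed slot length in bytes
    GeometryChunk* owner;   // chunk currently using the slot, or nullptr
};

class SlotCache {
public:
    // Register one VBO of bufferBytes, partitioned into slots of slotBytes.
    void addBuffer(std::uint32_t bufferBytes, std::uint32_t slotBytes) {
        if (slotBytes == 0) return;  // nothing sensible to partition
        std::uint32_t bufferIndex = nextBuffer_++;
        for (std::uint32_t off = 0; off + slotBytes <= bufferBytes; off += slotBytes) {
            SlotInfo s{nextUid_++, bufferIndex, off, slotBytes, nullptr};
            free_.emplace(slotBytes, s);
        }
    }

    // Take the smallest free slot that fits; returns its UID, or 0 if none fits.
    std::uint32_t acquire(std::uint32_t bytes, GeometryChunk* owner) {
        auto it = free_.lower_bound(bytes);
        if (it == free_.end()) return 0;
        SlotInfo s = it->second;
        s.owner = owner;
        free_.erase(it);
        used_[s.uid] = s;  // the UID -> slot map from the post
        return s.uid;
    }

    // Look up a used slot by UID; nullptr if the UID is not in use.
    const SlotInfo* find(std::uint32_t uid) const {
        auto it = used_.find(uid);
        return it == used_.end() ? nullptr : &it->second;
    }

private:
    std::multimap<std::uint32_t, SlotInfo> free_;  // free slots, ascending size
    std::map<std::uint32_t, SlotInfo> used_;       // slots in use, keyed by UID
    std::uint32_t nextUid_ = 1;                    // 0 is reserved for "no slot"
    std::uint32_t nextBuffer_ = 0;
};
```

With a 1 KB buffer of 256-byte slots and a 4 KB buffer of 1 KB slots registered, a 200-byte request would land in a 256-byte slot and a 300-byte request in a 1 KB slot, which also shows how much space a coarse set of size classes can waste.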


In reply to the original post by mhamlin:

That system might work well if you never need to evict data from the cache or update data (LOD changes). In my case I have a limited cache (24MB). As I move across my level/scene, new geometry becomes visible and needs to be cached (plus LOD updates), while old geometry already in the cache can be dumped. A fixed number of slots per size isn't a good fit here either.
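
To make the eviction point concrete, here is a minimal sketch of a fixed-budget cache that dumps the least recently used geometry when new geometry needs room. LRU is an assumption on my part (the post above doesn't say which eviction policy is used), and `LruBudget`, `touch` and `resident` are hypothetical names. It only tracks byte counts; the actual VBO uploads are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

class LruBudget {
public:
    explicit LruBudget(std::size_t budgetBytes) : budget_(budgetBytes) {}

    // Mark chunk `id` as used. If it is not resident, evict least recently
    // used chunks until it fits, then insert it. Returns the eviction count.
    // Note: a chunk larger than the whole budget is still inserted here;
    // a real system would reject or split it.
    std::size_t touch(std::uint32_t id, std::size_t bytes) {
        auto it = index_.find(id);
        if (it != index_.end()) {
            // Already cached: move it to the front (most recently used).
            order_.splice(order_.begin(), order_, it->second);
            return 0;
        }
        std::size_t evicted = 0;
        while (used_ + bytes > budget_ && !order_.empty()) {
            const Entry& victim = order_.back();  // least recently used
            used_ -= victim.bytes;
            index_.erase(victim.id);
            order_.pop_back();
            ++evicted;
        }
        order_.push_front(Entry{id, bytes});
        index_[id] = order_.begin();
        used_ += bytes;
        return evicted;
    }

    bool resident(std::uint32_t id) const { return index_.count(id) != 0; }
    std::size_t usedBytes() const { return used_; }

private:
    struct Entry { std::uint32_t id; std::size_t bytes; };
    std::size_t budget_;
    std::size_t used_ = 0;
    std::list<Entry> order_;  // front = most recent, back = eviction candidate
    std::unordered_map<std::uint32_t, std::list<Entry>::iterator> index_;
};
```

An LOD change could be handled in this sketch by treating each LOD of a chunk as its own `id`, so the old LOD simply ages out; that is one option, not the only one.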

The types of VBOs (dynamic, static, streaming) are merely hints to the driver about how you are going to use them, and dividing them up probably isn't going to affect performance much, if at all. In my case it's all dynamic.

As for VBO sizes, there is no optimal size. I believe they become less efficient above 6 or 8 MB (not sure about the low range). Just play with them until you find sizes/counts that work well :)

HTH


I implemented a quick and dirty version of the system I outlined above this afternoon. Unfortunately, I haven't had time to really test it yet, but so far it works.

I didn't mention it in my post above, but slots can be freed and returned to the free pool. If there is no slot available for a particular size, another appropriately sized and partitioned buffer will be allocated and its slots placed in the free pool.
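
For illustration, the free/grow behaviour for a single slot size might look roughly like this. It is a sketch, not the actual implementation: `SizeClassPool`, `Slot`, `acquire`, `release` and `grow` are hypothetical names, and the place where a real version would create the VBO is only marked with a comment.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

struct Slot {
    std::uint32_t buffer;  // index of the owning buffer
    std::uint32_t offset;  // byte offset inside that buffer
};

// Free pool for one slot size; allocates another partitioned buffer on demand.
class SizeClassPool {
public:
    SizeClassPool(std::uint32_t bufferBytes, std::uint32_t slotBytes)
        : bufferBytes_(bufferBytes), slotBytes_(slotBytes) {
        if (slotBytes == 0 || slotBytes > bufferBytes)
            throw std::invalid_argument("slot size must be in (0, bufferBytes]");
    }

    // Hand out a free slot, growing the pool by one buffer if none is left.
    Slot acquire() {
        if (free_.empty()) grow();
        Slot s = free_.back();
        free_.pop_back();
        return s;
    }

    // Return a slot to the free pool so it can be reused.
    void release(Slot s) { free_.push_back(s); }

    std::uint32_t bufferCount() const { return buffers_; }

private:
    void grow() {
        std::uint32_t index = buffers_++;
        // A real implementation would create and size the VBO here.
        for (std::uint32_t off = 0; off + slotBytes_ <= bufferBytes_; off += slotBytes_)
            free_.push_back(Slot{index, off});
    }

    std::uint32_t bufferBytes_;
    std::uint32_t slotBytes_;
    std::uint32_t buffers_ = 0;
    std::vector<Slot> free_;  // used as a stack of free slots
};
```

Note that this version never shrinks: once a buffer has been added its slots stay in the pool, so a burst of allocations leaves the extra buffers around.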

Currently I'm allocating buffers of only one megabyte, so perhaps I should up that to about five and see what I get. Of course, I'm worried about the memory overhead of that many slot info structures (each has a few 32-bit uints and a pointer).
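
A quick back-of-the-envelope check of that overhead, as a sketch: the struct layout below (three 32-bit uints and a pointer) and the 4 KB slot size are assumptions for illustration, not the real structure. On a typical 64-bit build such a struct comes to 24 bytes after padding.

```cpp
#include <cstddef>
#include <cstdint>

struct GeometryChunk;  // opaque owner type

// Assumed layout: "a few 32-bit uints and a pointer".
struct SlotInfo {
    std::uint32_t uid;
    std::uint32_t offset;
    std::uint32_t size;
    GeometryChunk* owner;
};

// Bytes of slot bookkeeping needed for one fully partitioned buffer.
constexpr std::size_t slotTableBytes(std::size_t bufferBytes, std::size_t slotBytes) {
    return (bufferBytes / slotBytes) * sizeof(SlotInfo);
}

// Example: a 5 MB buffer cut into 4 KB slots holds 1280 slots.
constexpr std::size_t kFiveMB = 5u * 1024u * 1024u;
constexpr std::size_t kExampleOverhead = slotTableBytes(kFiveMB, 4096);
```

The ratio of bookkeeping to vertex data is roughly `sizeof(SlotInfo) / slotBytes`, so it depends on the slot size rather than the buffer size. Going from 1 MB to 5 MB buffers doesn't change the relative overhead; very small slots are what make it noticeable.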

Perhaps I should only add new slot structures when they are requested, rather than creating them at buffer creation time. Hmm...
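
One way that could look, as a sketch only: keep a high-water mark per buffer and create a slot record the first time that region is requested, reusing freed records before carving new ones. `LazyBuffer`, `LazySlot`, `acquire`, `release` and `created` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

struct LazySlot {
    std::uint32_t uid;     // unique within this buffer, starting at 1
    std::uint32_t offset;  // byte offset into the buffer
};

// One buffer whose slot records are created on request, not up front.
class LazyBuffer {
public:
    LazyBuffer(std::uint32_t bufferBytes, std::uint32_t slotBytes)
        : bufferBytes_(bufferBytes), slotBytes_(slotBytes) {}

    // Fills `out` and returns true if a slot is available, false if full.
    bool acquire(LazySlot& out) {
        if (!free_.empty()) {            // reuse a released slot first
            out = free_.back();
            free_.pop_back();
            return true;
        }
        if (slotBytes_ == 0 || bufferBytes_ - next_ < slotBytes_)
            return false;                // no untouched space left
        out = LazySlot{nextUid_++, next_};
        next_ += slotBytes_;             // advance the high-water mark
        return true;
    }

    void release(const LazySlot& s) { free_.push_back(s); }

    // Number of slot records created so far.
    std::uint32_t created() const { return nextUid_ - 1; }

private:
    std::uint32_t bufferBytes_;
    std::uint32_t slotBytes_;
    std::uint32_t next_ = 0;     // first byte never handed out
    std::uint32_t nextUid_ = 1;
    std::vector<LazySlot> free_; // released slots awaiting reuse
};
```

The bookkeeping cost then tracks how much of the buffer has actually been used, at the price of one extra branch per allocation.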