Hey guys, I'm looking for some advice/insights into render batching. I've been building my own game engine for Android (OpenGL ES 2.0), and at the moment I feel like I'm handling my game object rendering very inefficiently.

Currently, I have GameObjects stored in a list in a GameWorld. Each GameObject has a reference to a texture and a mesh. At the moment, all I do is loop through the GameObjects in my renderer, binding and unbinding textures object by object. My understanding is that binding is quite an expensive operation. So, I guess my question is... does anyone have an elegant way to handle this?

Some of the ideas I had were:

- Sort the GameObjects list by texture id (possibly also expensive to do) so that objects with the same texture render together

- Instead of one array with everything in it, keep GameObjects of the same type in their own arrays, and loop through each of those, with only one 'bind' call required per group.
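To make the first idea concrete, here's a minimal sketch of sorting by texture id before drawing. `SpriteObject` and its `textureId` field are hypothetical stand-ins for your GameObject; the actual `glBindTexture` call is left as a comment so the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical minimal stand-in for a GameObject; only the field
// relevant to batching is shown.
class SpriteObject {
    final int textureId; // GL texture handle this object samples from
    SpriteObject(int textureId) { this.textureId = textureId; }
}

public class TextureSortDemo {
    // Sort once per frame (or only when the list changes), then bind a
    // texture only when it differs from the previously bound one.
    static int renderSorted(List<SpriteObject> objects) {
        objects.sort(Comparator.comparingInt((SpriteObject o) -> o.textureId));
        int binds = 0;
        int bound = -1; // no texture bound yet
        for (SpriteObject o : objects) {
            if (o.textureId != bound) {
                // glBindTexture(GL_TEXTURE_2D, o.textureId); // real code
                bound = o.textureId;
                binds++;
            }
            // draw the object's mesh here
        }
        return binds;
    }

    public static void main(String[] args) {
        List<SpriteObject> objs = new ArrayList<>();
        // Interleaved textures 7 and 3: unsorted, this would cost 4 binds.
        objs.add(new SpriteObject(7));
        objs.add(new SpriteObject(3));
        objs.add(new SpriteObject(7));
        objs.add(new SpriteObject(3));
        System.out.println(renderSorted(objs)); // 2 binds after sorting
    }
}
```

Note the sort itself is cheap relative to redundant binds, especially since a mostly-sorted list from the previous frame re-sorts quickly with Java's TimSort.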

I hope I was clear enough to create some idea of where I'm at and what I'm trying to do, I can't wait to hear your feedback.

In your case, for sprites, a render part is your GameObject, unless your characters are made of multiple sprites. Have them submit what they need to a render queue, and let the render queue decide the order in which they are drawn.
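A render queue along those lines could be sketched like this, assuming hypothetical integer ids for shaders and textures and a `Runnable` standing in for whatever actually issues the GL calls. Packing the ids into one sort key means a single sort groups draws by shader first, then by texture within each shader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical render queue: objects submit draw requests each frame,
// and the queue decides the drawing order by sorting on a packed key.
public class RenderQueue {
    static class DrawCall {
        final long key;      // shader id in high bits, texture id in low bits
        final Runnable draw; // whatever actually issues the GL calls
        DrawCall(int shaderId, int textureId, Runnable draw) {
            this.key = ((long) shaderId << 32) | (textureId & 0xFFFFFFFFL);
            this.draw = draw;
        }
    }

    private final List<DrawCall> calls = new ArrayList<>();

    void submit(int shaderId, int textureId, Runnable draw) {
        calls.add(new DrawCall(shaderId, textureId, draw));
    }

    // Sorting by the packed key groups all draws sharing a shader, and
    // within a shader, all draws sharing a texture.
    void flush() {
        calls.sort((a, b) -> Long.compare(a.key, b.key));
        for (DrawCall c : calls) c.draw.run();
        calls.clear();
    }

    public static void main(String[] args) {
        RenderQueue q = new RenderQueue();
        q.submit(2, 5, () -> System.out.println("shader2/tex5"));
        q.submit(1, 9, () -> System.out.println("shader1/tex9"));
        q.flush(); // shader1/tex9 is drawn first despite being submitted second
    }
}
```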

There are two big considerations when optimizing render systems. My knowledge is PC-based, but for all I know mobile devices are subject to the same bottlenecks.

First is graphics device state: the configuration of how the graphics card should do things. Changing these settings, such as wireframe vs. solid fill mode, or which shader to use, is expensive on its own, but it also kills pipelining. In case you don't know what that is, think of it as an assembly-line process: as long as a bunch of objects require the same processing, you don't need to wait for one object to finish completely before sending the next one. If the process has 5 steps, you can have 5 objects going through the pipeline at the same time, which saves A LOT of time.

So with that in mind, objects should be grouped together based on how similar their rendering process is. Textures are important, especially the large ones, but in my experience shaders and render states are more important.
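That priority ordering can be baked directly into a sort key: put the most expensive thing to change in the highest bits so sorting changes it least often. This is a sketch under the assumption that render state, shader, and texture are each identified by small hypothetical integer ids.

```java
// Sketch of a draw-sort key encoding the grouping priority suggested
// above: render state in the highest bits, then shader, then texture,
// so an ascending sort switches the expensive states least often.
public class SortKey {
    // 16 bits per field is plenty for illustration.
    static long make(int renderState, int shaderId, int textureId) {
        return ((long) renderState << 32) | ((long) shaderId << 16) | textureId;
    }

    public static void main(String[] args) {
        long a = make(1, 5, 9); // render state 1
        long b = make(2, 0, 0); // render state 2
        // All state-1 draws sort before state-2 draws, regardless of
        // which shader or texture they use.
        System.out.println(a < b); // true
    }
}
```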

The other thing to keep in mind is that while the graphics card will, on its own, skip content that it determines is not within the drawing area, the objects still need to reach the graphics card for that to be evaluated, and that is already a lot of wasted time. It is your responsibility to send the graphics card the smallest possible number of out-of-view objects. This is what's called Potentially Visible Set optimization, and space partitioning structures are the fastest way to do it (as far as I know); you should read up on that. Doing it object by object on the CPU is almost as wasteful as letting the GPU do it.
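A minimal space partitioning structure for a 2D game could be a uniform grid (real engines often use quadtrees or octrees, but the grid shows the idea). Everything here is hypothetical: objects are treated as points for brevity, and the query just returns ids of candidates inside the view rectangle, which are the only ones you'd then submit for drawing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal uniform-grid space partition for 2D culling (a sketch).
public class SpatialGrid {
    private final float cellSize;
    private final Map<Long, List<Integer>> cells = new HashMap<>();

    SpatialGrid(float cellSize) { this.cellSize = cellSize; }

    private long key(int cx, int cy) {
        return ((long) cx << 32) | (cy & 0xFFFFFFFFL);
    }

    // Insert an object id at a point position (point objects for brevity;
    // real objects would be inserted into every cell their bounds touch).
    void insert(int objectId, float x, float y) {
        int cx = (int) Math.floor(x / cellSize);
        int cy = (int) Math.floor(y / cellSize);
        cells.computeIfAbsent(key(cx, cy), k -> new ArrayList<>()).add(objectId);
    }

    // Return ids in cells overlapping the view rectangle; only these
    // are submitted to the GPU, the rest are never touched at all.
    List<Integer> query(float minX, float minY, float maxX, float maxY) {
        List<Integer> result = new ArrayList<>();
        int x0 = (int) Math.floor(minX / cellSize), x1 = (int) Math.floor(maxX / cellSize);
        int y0 = (int) Math.floor(minY / cellSize), y1 = (int) Math.floor(maxY / cellSize);
        for (int cx = x0; cx <= x1; cx++)
            for (int cy = y0; cy <= y1; cy++) {
                List<Integer> cell = cells.get(key(cx, cy));
                if (cell != null) result.addAll(cell);
            }
        return result;
    }

    public static void main(String[] args) {
        SpatialGrid grid = new SpatialGrid(10f);
        grid.insert(1, 5f, 5f);   // near the origin
        grid.insert(2, 105f, 5f); // far away
        System.out.println(grid.query(0f, 0f, 20f, 20f)); // [1]
    }
}
```

The point is that object 2 is never even looked at by the query, let alone sent to the GPU.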

A third thing you might want to be careful about is clipping. When an object is only partially visible, that is, part of it is within the view cube, the graphics device must cut the mesh to include only what is inside the cube and ignore the rest; if it didn't, the pixel shader would likely churn through a hell of a lot of pixels that aren't visible anyway. Small objects are not a problem, since most of the time they are likely to fit entirely within the view or be completely outside of it, but large objects, say a mountain cliff, are likely to be partially visible a LOT. Sometimes it's better to build these meshes in such a way that it's easy to discard most of them. I haven't really dug into this concern myself, but you might want to read up on it; I think "polygon clipping" or "trimming" are good search terms.

UPDATE: I've asked around with some engine developers. Apparently the biggest bottleneck on consoles and PC right now is texture size and count; however, state changes may still be a big deal on mobile platforms.

Most modern engines allow a number of objects to be outside the view window, and render any object that touches it (leaving the level designer to make sure they break big objects into pieces, so the completely hidden pieces are not rendered). If you use LODing, it doesn't matter much anyway, as those objects should not be on screen often, and large objects will be low detail at a distance. This is how CryEngine does it. Note that they also use culling objects (PC only) so level designers can further mask out portions that would otherwise get rendered with this technique.

Oops, edit: Another reason, if memory serves, is that even if you are rendering something off-screen at the edge of the screen right now, it will likely be on screen a short time later. My experience with CE3 is that it's rather liberal in its default settings.

I'm not incredibly experienced, so this may be a royal pain on larger systems, but it may be beneficial to maintain, for each texture or shader, a list of the meshes that reference it. Presumably you would place this list in the wrapper class for shaders and textures; then, when it comes time to render, you go through the texture pool, grab each texture, bind it, render the objects using that texture, unbind, then repeat the process with the remaining meshes. This way you can avoid some of the sorting pains that would normally occur at render time. Of course, you wouldn't necessarily be able to do that for anything requiring alpha blending, but that is a black art in and of itself (I believe recent API updates to DirectX and OpenGL allow for order-independent transparency of sorts, but I don't think OpenGL ES has received this sort of attention yet).
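The texture-pool idea above might look something like this sketch. `TexturePool`, the mesh names, and the commented-out GL calls are all hypothetical; the point is that grouping happens once at registration time, so rendering needs exactly one bind per texture with no per-frame sort.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the idea above: keep, per texture id, the list of meshes
// that use it, so each texture is bound exactly once per frame.
public class TexturePool {
    private final Map<Integer, List<String>> meshesByTexture = new HashMap<>();

    // Called when an object is created/loaded, not every frame.
    void register(int textureId, String meshName) {
        meshesByTexture.computeIfAbsent(textureId, k -> new ArrayList<>()).add(meshName);
    }

    // One bind per texture, then all meshes referencing it.
    int renderAll() {
        int binds = 0;
        for (Map.Entry<Integer, List<String>> e : meshesByTexture.entrySet()) {
            // glBindTexture(GL_TEXTURE_2D, e.getKey()); // real code
            binds++;
            for (String mesh : e.getValue()) {
                // draw this mesh here
            }
            // unbind here if desired
        }
        return binds;
    }

    public static void main(String[] args) {
        TexturePool pool = new TexturePool();
        pool.register(7, "player");
        pool.register(7, "enemy");
        pool.register(3, "terrain");
        System.out.println(pool.renderAll()); // 2 binds for 3 meshes
    }
}
```

As noted, this breaks down for alpha-blended objects, which need back-to-front ordering regardless of which texture they use, so those would go through a separate, depth-sorted pass.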