The absolute fastest way to do this is to create render lists and render ALL objects in one function.

However, this is not the easiest approach. But, at least in Direct3D programming, using this method you are guaranteed to make device state changes only when absolutely necessary, and only for the objects that actually need those states changed. Altering device states for every single object is a major hack and a tremendous slowdown.

For instance, if you have several alpha-blended objects and you use the virtual Draw() method, is every alpha-blended object going to turn alpha blending on and then off again? If you have even 50 of these objects on screen, that is 100 device state changes for one render. Way too many.

In other words, state changes should NOT be made by the object itself, but by the render function that draws all the objects.

This is not the easiest thing to code, but it is very fast when you get it working right.

You should also only make texture changes when necessary in Direct3D. Set a texture, render ALL polys that use that texture, set another texture, render ALL polys that use it, etc., etc.
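The texture-batching idea above can be sketched as follows. This is a minimal illustration, not actual Direct3D code: `MockDevice` and `Poly` are stand-ins I've invented so the state-change counting is visible, and the sort key is just an integer texture ID.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the real device and polygon types.
struct Poly { int textureId; };

struct MockDevice {
    int currentTexture = -1;
    int setTextureCalls = 0;
    void SetTexture(int id) { currentTexture = id; ++setTextureCalls; }
    void DrawPoly(const Poly&) { /* submit geometry here */ }
};

// Sort polys by texture, then change the texture only on group boundaries.
void RenderBatchedByTexture(MockDevice& dev, std::vector<Poly>& polys) {
    std::sort(polys.begin(), polys.end(),
              [](const Poly& a, const Poly& b) { return a.textureId < b.textureId; });
    for (const Poly& p : polys) {
        if (p.textureId != dev.currentTexture)  // only switch when the texture differs
            dev.SetTexture(p.textureId);
        dev.DrawPoly(p);
    }
}
```

With six polys spread across two textures, this makes exactly two SetTexture calls instead of six.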

OpenGL is a bit different, but display lists and render lists are not foreign to OpenGL either.

Decide what objects are in the frustum and which ones are not - a la quadtree, binary space partition tree, etc.

Don't worry about those that are partially in and partially out of the frustum - it takes more time to subdivide these objects or clip them against the frustum than it does for the video hardware to simply auto-clip them with its onboard routines.

It was my understanding you only need to worry about sorting by Z order when working with primitives that use blending or transparency, otherwise the depth buffer handles that. I've even heard that when working with both textured and lighted polygons (whether built-in lighting or fragment/vertex shaders), it's actually faster to rely on the depth buffer than to use something such as the old-school Doom BSP rendering algorithm (the painter's algorithm, I believe it's called, where you render the farthest BSP leaf first, then traverse the BSP tree back to front, rendering each leaf).

The way you outlined things, Bubba, certainly is the most efficient. I'm wondering what your renderer setup looks like. In mine, I have a Renderer class which holds pointers to every single type of item that can be rendered...each object in the application just adds itself to the renderer, then at the end of the frame the renderer draws everything it points to.

It was my understanding you only need to worry about sorting by Z order when working with primitives that use blending or transparency, otherwise the depth buffer handles that.

It does. However since you don't want to sort alpha blended objects inside of the render loop I would recommend pre-sorting all objects before entering the loop.
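The pre-sorting step described here might look like this minimal sketch, assuming a hypothetical `AlphaObject` type that carries its camera-space depth. The sort runs once per frame, before the render loop, so the loop itself stays tight.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical alpha-blended object: only the camera-space depth matters here.
struct AlphaObject { float z; /* plus mesh, texture ID, etc. */ };

// Sort back-to-front once per frame, before entering the render loop.
void PreSortAlphaObjects(std::vector<AlphaObject>& objects) {
    std::sort(objects.begin(), objects.end(),
              [](const AlphaObject& a, const AlphaObject& b) { return a.z > b.z; });
}
```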

My system uses a vector, which is not the most efficient. The render system iterates through the vector and renders each object. I tried to use one huge vertex buffer but so far have come up with several issues in doing this that are not easily solved - such as which sets of vertices to translate, etc. This all requires either a dynamic vertex buffer, which is a bit slower than a static one, or compiling the vertex buffer on the fly...which again is not the fastest thing in the world.

Textures for objects are stored in a global vector (actually a global instance of a class that handles the vector) and are each given an ID. IDs are then all that is used for the texture throughout the system. The actual IDirect3DTexture9 interfaces exist only in the CTextureManager class, but a texture ID can be used any number of times.
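A stripped-down sketch of that ID scheme, with the important caveat that the member names and the filename-based lookup are my assumptions, not the poster's actual CTextureManager. The manager owns the real texture interfaces; everything else only ever sees integer IDs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of a texture manager that owns the real textures and hands out IDs.
class CTextureManager {
public:
    // Returns an existing ID if the file was already loaded.
    int GetTextureID(const std::string& filename) {
        for (size_t i = 0; i < m_names.size(); ++i)
            if (m_names[i] == filename) return static_cast<int>(i);
        m_names.push_back(filename);   // real code would also create the
                                       // IDirect3DTexture9 interface here
        return static_cast<int>(m_names.size() - 1);
    }
private:
    std::vector<std::string> m_names;  // parallel to the real texture interfaces
};
```

Requesting the same (hypothetical) filename twice hands back the same ID, so a texture is never loaded into memory more than once.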

Animation objects are simply a list of texture IDs and frame times. The actual textures for the animation are stored in the global texture vector.

Here is my render order:

- Static non-alpha-blended objects
- Static alpha-blended objects
- Animated alpha-blended objects

Each object vector is compiled with objects that share state blocks. So the renderer looks at the state block for the vector or list, changes the states if need be, and then renders all objects in the vector. So each instance of the vector has a state block that determines what Direct3D states are required to correctly render the object.
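The bucket-per-state-block loop described above can be sketched like this. The `StateBlock` here is deliberately tiny (one alpha-blend flag) and `MockDevice` counts state changes so the batching effect is verifiable; real code would compare and apply full Direct3D state blocks.

```cpp
#include <cassert>
#include <vector>

// Hypothetical state block: only the states the renderer compares against.
struct StateBlock {
    bool alphaBlend = false;
    bool operator!=(const StateBlock& o) const { return alphaBlend != o.alphaBlend; }
};

struct Object { /* mesh, transform, ... */ };

// One bucket per shared state block, as described above.
struct RenderBucket {
    StateBlock states;
    std::vector<Object> objects;
};

struct MockDevice {
    StateBlock current;
    int stateChanges = 0;
    void ApplyStates(const StateBlock& s) { current = s; ++stateChanges; }
    void Draw(const Object&) {}
};

// Change device states at most once per bucket, never once per object.
void RenderAll(MockDevice& dev, const std::vector<RenderBucket>& buckets) {
    for (const RenderBucket& b : buckets) {
        if (b.states != dev.current)    // only touch the device when needed
            dev.ApplyStates(b.states);
        for (const Object& obj : b.objects)
            dev.Draw(obj);
    }
}
```

Rendering an opaque bucket followed by an alpha-blended bucket of three objects each costs a single state change, not six.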

If pixel shaders or vertex shaders are needed, there are members of the state block that handle these as well as the name of the shader so that it can be loaded. So far I've not put in support for getting shaders from compiled effect files.

State blocks are really not that hard to compile since I make it easier by defining several types of objects. These types of objects can be then used in the script or data file that the object is in. I do not yet have a file format to hold all of the objects in, but I'm working on it.
Each object can be a member of several pre-defined state blocks using one word which identifies which type of object it is, or it can use the CUSTOM state block which automatically places it in the custom state block vector. These are slower to render because the state blocks can change at any time. Fortunately most objects needing to be rendered share a lot of the same device states.

My system uses a vector, which is not the most efficient. The render system iterates through the vector and renders each object. I tried to use one huge vertex buffer but so far have come up with several issues in doing this that are not easily solved - such as which sets of vertices to translate, etc. This all requires either a dynamic vertex buffer, which is a bit slower than a static one, or compiling the vertex buffer on the fly...which again is not the fastest thing in the world.

I have run into similar issues and have settled on a similar setup. My setup is also not the most efficient (but it is more efficient than 'immediate mode' drawing where everything renders itself upon its Draw() hook). I have a vector for each type of object that I want to draw. I currently empty the vectors upon completion of the frame, although I could easily just leave them and save the speed.

Textures for objects are stored in a global vector (actually a global instance of a class that handles the vector) and are each given an ID. IDs are then all that is used for the texture throughout the system. The actual IDirect3DTexture9 interfaces exist only in the CTextureManager class, but a texture ID can be used any number of times.

Again, same with me: textures are handled by a global texture manager. I use OpenGL, and the 'actual' textures are stored within the class; the texture manager just hands out IDs. This makes it so you cannot load the same texture into memory twice (if you do try, the texture manager just hands out the ID of the previous copy of the texture). Upon being asked for a texture that does not exist, the texture manager hands out an 'error' texture, which currently says 'you suck at life'.

BobMcGee, I like your approach. So you still call the Draw() method on the Knight class, but that Draw method simply tells your renderer "Draw me!", correct? A good approach.

Correct, verbatim from what you said. While it is not perfect, it is pretty efficient and seems similar (if not identical?) to Bubba's approach. Everything is also MUCH more reusable, and I never ever have to worry about choreographing the graphics anymore...if I want something drawn, I just add it to the renderer...fire and forget, baby. Note that my renderer has a different data type to hold the data for each object to be drawn; for example, here's a data container for an MS3D model:
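The original code block from this post did not survive the thread, so the following is only a guess at what a per-frame MS3D draw record might contain. Every name and field here is an assumption, not BobMcGee's actual structure.

```cpp
#include <cassert>

// Hypothetical reconstruction of a per-frame draw record for an MS3D model.
struct MS3DRenderData {
    int   modelId;          // which loaded MS3D model to draw
    float position[3];      // world-space translation
    float rotation[3];      // Euler angles, degrees
    int   animationFrame;   // current keyframe to interpolate from
};
```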

Note that I do this every frame, but if I changed my setup just a tad I wouldn't have to (rather, I would just add stuff to the renderer as needed, but instead of removing it every frame just flag it as visible or not and save the time it takes to call new and delete all the damn time).

The biggest problem is that whenever I want a new 'type' of thing to be drawn, I have to add the following to my renderer:

-A data structure to hold it to distinguish it from other types
-A std::vector<DataTypeAbove> to the renderer class
-A function to add the type to the renderer
-The actual implementation of the rendering algorithm
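The four pieces listed above might look like this for one hypothetical 'Sprite' type; every new drawable type repeats the same pattern. The names are illustrative, not the poster's actual renderer, and the draw loop counts instead of issuing real draw calls so it can run standalone.

```cpp
#include <cassert>
#include <vector>

struct SpriteData { float x, y; int textureId; };  // 1. the data structure

class Renderer {
public:
    void AddSprite(const SpriteData& s) { m_sprites.push_back(s); }  // 3. the add function
    int DrawSprites() {                                              // 4. the render routine
        int drawn = 0;
        for (const SpriteData& s : m_sprites) { (void)s; ++drawn; }  // real draw calls go here
        return drawn;
    }
    void EndFrame() { m_sprites.clear(); }  // emptied each frame, as described
private:
    std::vector<SpriteData> m_sprites;      // 2. the per-type vector
};
```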

Perhaps some of that *could* be alleviated by using polymorphism and OOP, but what I have works and I lead a conservative lifestyle.

"How to render your art in an object oriented way" sticky post :d, but not quite

So, I can do a few things:

1. I could make a class with pointers to every object I could render, which creates a centralized render object...

2. I could make a class with a single bool and have my drawable classes inherit all that good stuff they need...

3. I could make a class with lots of drawing functions and couple the functions with if statements...

4. I could even do what Bubba said... Hmm, I believe he said to make render lists, i.e. a list of pointers pointing to several objects which draw something, and then use the pointers to render in that one function, kinda like #1...

These are some serious ways to use object-oriented programming, making the most optimized code you can. That's what it's all about, thus why...

"If it isn't written in C++, it isn't fast enough."

Yes, these are the basics and I AM asking about them, so please no more back-to-the-basics posts, they make me feel incompetent...

The only real answer is to find what works for yourself. I've tried just about every version listed in this thread and invented others, most of which just kind of failed miserably. However, that means that the product I have right now is the result of having spent a lot of time and energy thinking about and implementing this junk.

Bob is right. I've tried just about every other method in this thread. The first one was the virtual Draw() where each object was responsible for drawing itself. But Direct3D just does not lend itself to this type of approach since it is basically a simple state machine. So state blocks work much better. And from my assembly language experience I also know that calling functions has inherent overhead and doing this a lot in the main render loop is not good. So if you do a Device->SomeFunctionCall() 20 times in a render loop as opposed to 10000 times, you will certainly gain CPU cycles and thus frames.

Since what Bob and myself are saying is nearly lining up perfectly with what I've seen recommended in the SDK, I'd say that is the path you would want to go down. Save yourself the trouble: moving from an object-based render system to a centralized render system is NOT an easy transition. Save yourself some time and dump the object-render now while you are still in the early stages of the engine.

This new approach will take a lot more time and a lot more thought process, but once you get it done I guarantee it will be much faster. This, in turn, frees up frames for some very cool effects, scripts, AI, or whatever it is you need the power for. Ideally you only want the render portion of the engine to take up as many cycles as is absolutely necessary. After all, a game is far more than just the rendering and graphics. There is a lot more going on, and trust me, you need the cycles to get it all done right.

Now don't get me wrong: in small games or demos you can get away with an object render.

My old XSpace engine was an object-based render that had no frustum culling. It could handle over 3500 laser quads and still maintain around 40 frames per second, while also drawing 20 planets with 400 triangles each, and an alpha-blended sun along with a simple dot3 windshield glare effect. But I changed it...and as of yet have no new demos of the system in place, because it has been a very tedious process transitioning from object-based rendering to centralized rendering.

Since what Bob and myself are saying is nearly lining up perfectly with what I've seen recommended in the SDK

What SDK are you referring to?

Save yourself the trouble: moving from an object-based render system to a centralized render system is NOT an easy transition

Agreed

This, in turn, frees up frames for some very cool effects, scripts, AI, or whatever it is you need the power for.

I have been agonizing over how to proceed with my design for implementing higher-level stuff. For example, you can modify the texture matrix such that you can have texture animations (typically compiled from scripts). Do I add an 'animated texture polygon' along with the texture animation information to the renderer class and let the renderer animate the textures? Or do I create a new data type that just accepts a new texture matrix as a parameter, let each object animate itself outside of the renderer, and then have it send itself (along with the new texture matrix) to the renderer? I'm trying to keep my renderer dumb and primitive and not implement anything too high-level.
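The second option (the object computes its own texture matrix and the renderer stays dumb) might look like this sketch for a simple scrolling-UV animation. The `TexMatrix` type and the row-major translation slots are my assumptions; the real matrix layout depends on the API convention used.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 4x4 texture matrix in row-major layout.
struct TexMatrix { float m[4][4]; };

// Build a texture matrix that scrolls the UVs over time.
TexMatrix ScrollingTexMatrix(float time, float uSpeed, float vSpeed) {
    TexMatrix t = {};                                 // zero everything
    for (int i = 0; i < 4; ++i) t.m[i][i] = 1.0f;     // identity
    // Wrap the offsets so the coordinates stay in [0, 1).
    t.m[3][0] = std::fmod(time * uSpeed, 1.0f);       // u translation
    t.m[3][1] = std::fmod(time * vSpeed, 1.0f);       // v translation
    return t;
}
```

The object would call this each frame and hand the result to the renderer as just another parameter, keeping all animation logic out of the render loop.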

Some other things I've been agonizing over:

-How to keep track of lighting and material information. I considered writing another class, a MaterialManager, which, similarly to the texture manager, keeps track of all of the materials that can be lit. When my renderer binds a texture, it asks the TextureManager for the texture. Similarly, I might let the renderer ask a MaterialManager for lighting/material information.

-How to keep track of stencil shadow volumes. I recently wrote routines for finding triangle neighbors and then subsequently silhouette edges (which are then used to build the shadow volumes) in an MS3D model. Actually rendering the shadow volume, using Carmack's reverse, is pretty trivial as it's (relatively) simple: it uses the depth buffer instead of ray casting to determine what is and isn't in shadow and writes the results to the stencil buffer, so that when writing your lighting routines with fragment shaders you can determine what to light and what not to light.