Rendering terrain - multiple draw calls or draw whole VBO?

I have a simple terrain based on a Perlin noise heightmap that is about 64 times larger than what will be rendered on screen (8 times in each direction). It is currently stored in a single VBO and I was wondering what the optimal way to draw it is.

The look I am going for is a 3D version of Settlers 1, with the camera at a 45-degree(ish) angle, so I am not interested in LOD mapping or anything like that at the moment, since I don't think the view distance will warrant it.

I guess the question is really - how good is OpenGL at discarding vertices outside of the viewing frustum?

I know that draw calls are expensive, so should I just fire the whole terrain into a single draw call and let OpenGL deal with the vertices that won't be shown, or should I spend some CPU time calculating which sections need to be drawn and performing multiple draw calls for only the parts of the terrain I need? I've seen mention of glMultiDraw*(), which looks like it might be the best of both worlds, but I haven't used it yet - should I look into this?

I also want my terrain to wrap - is it worth using the extra memory to extend the repeated terrain in the VBO, or are the extra 3 draw calls (if wrapping in both directions) not going to affect me too much if I redraw the same terrain model with a shifted model matrix? In this case the first option above would mean drawing about 256 times what is on screen :/

I know I am pre-optimising, since I've just started the project (my first in OpenGL), but I'd rather get it right from the start.

Do you mean 2D as in a mostly flat terrain, or as a reference to Settlers' original implementation? Although I am copying the look to an extent, it will be a 3D world with the ability to rotate the camera and such.

I guess the question is really - how good is OpenGL at discarding vertices outside of the viewing frustum?

It can't discard vertices which are outside of the viewing frustum, only entire primitives which don't intersect the viewing frustum. Also, the tests can only be performed after vertex processing. A client-side frustum test will normally be worthwhile for any non-trivial scene.
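A client-side test of this kind could look roughly like the following. It is only a minimal sketch, assuming each terrain chunk has an axis-aligned bounding box and that the frustum planes have already been extracted with their normals pointing inwards. The names `Plane`, `Aabb` and `aabb_maybe_visible` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Plane a*x + b*y + c*z + d = 0; points with a non-negative value
 * are treated as inside (assumed convention). */
typedef struct { float a, b, c, d; } Plane;

/* Axis-aligned bounding box of one terrain chunk (hypothetical layout). */
typedef struct { float min[3], max[3]; } Aabb;

/* Returns false only when the box lies entirely outside at least one
 * plane. The test is conservative: a box near a frustum corner may be
 * reported as visible even though it is not. */
bool aabb_maybe_visible(const Aabb *box, const Plane *planes, int plane_count)
{
    assert(box && planes && plane_count >= 0);
    for (int i = 0; i < plane_count; ++i) {
        const Plane *p = &planes[i];
        /* Pick the box corner that lies farthest along the plane normal. */
        float x = p->a >= 0.0f ? box->max[0] : box->min[0];
        float y = p->b >= 0.0f ? box->max[1] : box->min[1];
        float z = p->c >= 0.0f ? box->max[2] : box->min[2];
        if (p->a * x + p->b * y + p->c * z + p->d < 0.0f)
            return false; /* even the best corner is outside this plane */
    }
    return true;
}
```

Chunks for which this returns false can be skipped before any draw command is issued for them.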

Originally Posted by rakketh

I know that draw calls are expensive, so should I just fire the whole terrain into a single draw call and let OpenGL deal with the vertices that won't be shown,

That's less than ideal, but with modern hardware you will probably get away with it unless the terrain is vast.

Originally Posted by rakketh

or should I spend some CPU time calculating which sections need to be drawn and performing multiple draw calls for only the parts of the terrain I need?

You don't necessarily need to use multiple draw calls; e.g. you could construct an index array and use a single glDrawElements() call.
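As a rough sketch of the index-array idea, assuming the heightmap vertices are stored row by row with `verts_per_row` vertices per row and 32-bit indices: the function below fills an index list covering only a rectangle of grid cells. The function name, the parameter names and the triangle winding are my own assumptions, not something from the posts above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes two triangles (6 indices) for every grid cell with
 * x0 <= x < x1 and z0 <= z < z1. Returns the number of indices written.
 * The caller must make sure z1 stays inside the heightmap; only the
 * x range can be checked here. */
size_t build_visible_indices(uint32_t verts_per_row,
                             uint32_t x0, uint32_t z0,
                             uint32_t x1, uint32_t z1,
                             uint32_t *out, size_t out_cap)
{
    assert(out && x0 <= x1 && z0 <= z1 && x1 < verts_per_row);
    size_t n = 0;
    for (uint32_t z = z0; z < z1; ++z) {
        for (uint32_t x = x0; x < x1; ++x) {
            uint32_t i00 = z * verts_per_row + x;   /* this row, this column */
            uint32_t i10 = i00 + 1;                 /* this row, next column */
            uint32_t i01 = i00 + verts_per_row;     /* next row, this column */
            uint32_t i11 = i01 + 1;                 /* next row, next column */
            assert(n + 6 <= out_cap);
            out[n++] = i00; out[n++] = i01; out[n++] = i10;
            out[n++] = i10; out[n++] = i01; out[n++] = i11;
        }
    }
    return n;
}
```

The result could then be uploaded to the bound GL_ELEMENT_ARRAY_BUFFER (for example with glBufferSubData) and drawn with one glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_INT, 0) call. Whether rebuilding the list each time the view moves is a net win would need measuring.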

Originally Posted by rakketh

I've seen mention of glMultiDraw*(), which looks like it might be the best of both worlds, but I haven't used it yet - should I look into this?

Ideally, yes. If you go that route, you still need to keep the number of "commands" reasonable, so I'd suggest making the chunks roughly the size of the visible area, so that you're typically rendering around 3x3 chunks at a time. If the chunks are too small, the increased per-command overhead will have a cost. If they're too large, the increase in the total number of triangles will have a cost. You'll need to experiment to determine the optimal size.
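To illustrate the 3x3 idea, here is a minimal sketch that builds the count and offset arrays a glMultiDrawElements call takes. It assumes a layout that is not stated in this thread: the index buffer stores the chunks row by row, every chunk uses the same number of 32-bit indices, and the terrain does not wrap. `ChunkLayout` and `gather_chunk_commands` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: chunk (x, z) starts at index
 * (z * chunks_x + x) * indices_per_chunk in the element buffer. */
typedef struct {
    int chunks_x, chunks_z;   /* number of chunks along each axis */
    int indices_per_chunk;    /* identical for every chunk */
} ChunkLayout;

/* Collects one command per chunk in the (2*radius+1)^2 block centred on
 * chunk (cx, cz), skipping chunks that fall off the terrain edge.
 * offsets[] receives byte offsets into the element buffer, stored as
 * pointers because that is how glMultiDrawElements expects them.
 * Returns the number of commands written. */
int gather_chunk_commands(const ChunkLayout *l, int cx, int cz, int radius,
                          int *counts, const void **offsets, int cap)
{
    assert(l && counts && offsets && radius >= 0);
    int n = 0;
    for (int z = cz - radius; z <= cz + radius; ++z) {
        if (z < 0 || z >= l->chunks_z) continue;
        for (int x = cx - radius; x <= cx + radius; ++x) {
            if (x < 0 || x >= l->chunks_x) continue;
            assert(n < cap);
            size_t first = ((size_t)z * (size_t)l->chunks_x + (size_t)x)
                         * (size_t)l->indices_per_chunk;
            counts[n] = l->indices_per_chunk;
            offsets[n] = (const void *)(uintptr_t)(first * sizeof(uint32_t));
            ++n;
        }
    }
    return n;
}
```

With the element buffer bound, the arrays could be passed as glMultiDrawElements(GL_TRIANGLES, counts, GL_UNSIGNED_INT, offsets, n). A radius of 1 gives the 3x3 block; a frustum test per chunk could trim it further.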

Originally Posted by rakketh

I also want my terrain to wrap - is it worth using the extra memory to extend the repeated terrain in the VBO,

If you end up using a single glDrawArrays() call, then probably.

Originally Posted by rakketh

or are the extra 3 draw calls (if wrapping in both directions) not going to affect me too much if I redraw the same terrain model with a shifted model matrix? In this case the first option above would mean drawing about 256 times what is on screen :/

I wouldn't recommend drawing the entire terrain four times.
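One possible alternative, sketched here under my own assumptions rather than taken from the posts above: if the terrain is split into chunks, only the visible chunks need drawing, and a chunk coordinate that runs past the edge can be wrapped back into the stored terrain together with a world-space shift. `WrappedChunk` and `wrap_chunk` are illustrative names, and the function handles one axis at a time.

```c
#include <assert.h>

typedef struct {
    int chunk;     /* chunk index inside the stored terrain, in [0, n) */
    float offset;  /* translation along this axis to apply when drawing */
} WrappedChunk;

/* Maps an unbounded chunk coordinate c onto the stored terrain.
 * terrain_size is the world-space width of the whole terrain. */
WrappedChunk wrap_chunk(int c, int chunks_per_side, float terrain_size)
{
    assert(chunks_per_side > 0);
    int w = c % chunks_per_side;
    if (w < 0) w += chunks_per_side;         /* C's % can be negative */
    int tiles = (c - w) / chunks_per_side;   /* whole terrains shifted */
    WrappedChunk r = { w, (float)tiles * terrain_size };
    return r;
}
```

Calling it once for x and once for z gives the stored chunk to draw and the translation to put into the model matrix (or a uniform) for that chunk.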

Another option is to just draw a fixed-size grid with dummy data (attribute zero has to be bound, but it can be one byte per vertex, or even one byte in total if you use instancing), and generate everything in the vertex shader from data stored in textures.

There is some performance penalty for using textures, as texture lookups are random-access whereas attribute data can be pipelined more efficiently. But it may well be less than the alternatives.
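To give a feel for the arithmetic involved, the following is a CPU-side mirror in C of what such a vertex shader might compute from the vertex number (gl_VertexID in GLSL). It is a sketch under assumptions: a plain float array stands in for the height texture, the grid spacing is one world unit, and lookups wrap around the way a repeating texture would. `Vec3` and `grid_vertex` are hypothetical names.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

/* vertex_id:      index of the vertex within the fixed-size grid
 * verts_per_row:  width of that grid in vertices
 * origin_x/z:     heightmap cell under the grid's first vertex
 * heights:        map_w * map_h height samples, row by row */
Vec3 grid_vertex(int vertex_id, int verts_per_row,
                 int origin_x, int origin_z,
                 const float *heights, int map_w, int map_h)
{
    assert(vertex_id >= 0 && verts_per_row > 0);
    assert(heights && map_w > 0 && map_h > 0);
    /* Position of this vertex in unbounded heightmap coordinates. */
    int gx = origin_x + vertex_id % verts_per_row;
    int gz = origin_z + vertex_id / verts_per_row;
    /* Wrap the sample position so the terrain repeats. */
    int sx = ((gx % map_w) + map_w) % map_w;
    int sz = ((gz % map_h) + map_h) % map_h;
    Vec3 v = { (float)gx, heights[sz * map_w + sx], (float)gz };
    return v;
}
```

Moving the camera then only changes `origin_x` and `origin_z`; the grid itself and its index buffer stay fixed, and wrapping comes from the lookup rather than from duplicated geometry.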

You don't necessarily need to use multiple draw calls; e.g. you could construct an index array and use a single glDrawElements() call.

I'm currently binding my indices to a VBO (GL_ELEMENT_ARRAY_BUFFER), so I just render the whole thing. Are you saying it would be more efficient to send the index data each time instead of binding it to video memory in order to ask for a drawing of fewer vertices?

Originally Posted by GClements

Ideally, yes. If you go that route, you still need to keep the number of "commands" reasonable

What do you mean by "command" in this instance? I would imagine glMultiDraw would draw all of the sets in the same draw call. It would seem like a waste of an addition to the language if it is just equivalent to calling draw for each set.

Originally Posted by GClements

Another option is to just draw a fixed-size grid with dummy data (attribute zero has to be bound, but it can be one byte per vertex, or even one byte in total if you use instancing), and generate everything in the vertex shader from data stored in textures.

I'll look into this, as it sounds interesting (do you have any links?), but it sounds too advanced for what I am trying to do.

I'm currently binding my indices to a VBO (GL_ELEMENT_ARRAY_BUFFER), so I just render the whole thing. Are you saying it would be more efficient to send the index data each time instead of binding it to video memory in order to ask for a drawing of fewer vertices?

More efficient than what? A single draw call with the index array updated each time could be more efficient than multiple draw calls with fixed index arrays, or than drawing far more than is visible. Or it might not. It all depends upon the details.

Originally Posted by rakketh

What do you mean by "command" in this instance?

The individual "batches" within a multi-draw call. The things which would be individual draw calls if you didn't have glMultiDrawElements etc.
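To make the terminology concrete: as I understand the specification, a multi-draw call has the same effect as a loop over single draw commands, one per entry in the arrays. The sketch below shows that loop with a hypothetical callback (`DrawCommandFn`) standing in for the real glDrawElements call, so it can run without a GL context. `multi_draw_reference`, `DrawStats` and `tally_command` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one glDrawElements-style command. */
typedef void (*DrawCommandFn)(int count, const void *offset, void *user);

/* Issues drawcount commands one after another. Each iteration is one
 * "command" (batch) in the sense used above. Returns the number issued. */
int multi_draw_reference(const int *counts, const void *const *offsets,
                         int drawcount, DrawCommandFn draw, void *user)
{
    assert(counts && offsets && draw && drawcount >= 0);
    for (int i = 0; i < drawcount; ++i)
        draw(counts[i], offsets[i], user);
    return drawcount;
}

/* Example callback: tallies how many commands and indices were issued. */
typedef struct { int commands; long indices; } DrawStats;

void tally_command(int count, const void *offset, void *user)
{
    (void)offset;            /* unused in this tally */
    DrawStats *s = user;
    s->commands += 1;
    s->indices += count;
}
```

The difference with the real call is that the loop runs inside the driver rather than in application code.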

Originally Posted by rakketh

I would imagine glMultiDraw would draw all of the sets in the same draw call. It would seem like a waste of an addition to the language if it is just equivalent to calling draw for each set.

I would expect there to be some overhead for each batch, making a multi-draw call not quite as fast as a simpler call, but still faster than multiple draw calls.