Edit: I'm asking this question purely for my own learning experience, NOT for performance reasons.

This is in regards to a Minecraft-like terrain engine. I store blocks in chunks (16x256x16 blocks in a chunk). When I generate a chunk, I use multiple procedural techniques to set the terrain and place objects. While generating, I keep one 1D array for the full chunk (solid or not) and a separate 1D array of solid blocks.

After generation, I iterate through the solid blocks checking their neighbors so I only generate block faces that don't have solid neighbors. I store which faces to generate in their own list (that's 6 lists, one per possible face/normal). When rendering a chunk, I render all lists in the camera's current chunk and only the lists facing the camera in all other chunks. I do this by storing all 6 lists in one buffer, then I simply change what ranges I draw.
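The neighbor check described above can be sketched roughly as follows. This is a minimal illustration, not the actual engine code: the flattening order in `index`, the `NORMALS` names, and the choice to treat anything outside the chunk as empty are all assumptions made for the example.

```python
# Chunk dimensions from the question: 16 x 256 x 16.
SIZE_X, SIZE_Y, SIZE_Z = 16, 256, 16

# One (dx, dy, dz) offset per face normal; the names are illustrative.
NORMALS = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}

def index(x, y, z):
    """Flatten (x, y, z) into the 1D chunk array (assumed layout)."""
    return (y * SIZE_Z + z) * SIZE_X + x

def is_solid(solid, x, y, z):
    """Blocks outside the chunk count as empty here (an assumption)."""
    if not (0 <= x < SIZE_X and 0 <= y < SIZE_Y and 0 <= z < SIZE_Z):
        return False
    return solid[index(x, y, z)]

def visible_faces(solid, solid_blocks):
    """Return six lists of block positions, one per normal, keeping only
    faces whose neighbor in that direction is not solid."""
    faces = {name: [] for name in NORMALS}
    for (x, y, z) in solid_blocks:
        for name, (dx, dy, dz) in NORMALS.items():
            if not is_solid(solid, x + dx, y + dy, z + dz):
                faces[name].append((x, y, z))
    return faces
```

A real engine would probably also look into neighboring chunks at the chunk border instead of treating them as empty.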

Using a 2D atlas with this little shader trick Andrew Russell suggested, I want to merge similar faces together completely, that is, faces that are in the same list (same normal), are adjacent to each other, have the same light level, etc. I know I will still end up with rectangles, but it should easily reduce my vertex count by 50% or better if my estimates are correct.

My assumption would be to have each of the 6 lists sorted by the axis they rest on, then by the other two axes (the list for the top of a block would be sorted by its Y value, then X, then Z).

With this alone, I could quite easily merge strips of faces, but I'm looking to merge more than just strips together when possible. I've read up on this greedy meshing algorithm, but I am having a lot of trouble understanding it.
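As a rough sketch of the strip merging that the sorted lists make easy: the function below assumes each top face is a hypothetical `(y, x, z, key)` tuple, where `key` stands in for whatever must match before two faces may merge (light level, texture, and so on). Sorting puts faces with the same Y and X next to each other in Z order, so a strip is just a run of consecutive Z values.

```python
def merge_strips(faces):
    """Merge top faces into strips along Z.

    faces: iterable of (y, x, z, key) tuples for a single normal.
    Returns a list of (y, x, z_start, length, key) strips.
    """
    strips = []
    for y, x, z, key in sorted(faces):
        if strips:
            sy, sx, sz, length, skey = strips[-1]
            # Extend the previous strip if this face continues it exactly.
            if (sy, sx, skey) == (y, x, key) and sz + length == z:
                strips[-1] = (sy, sx, sz, length + 1, skey)
                continue
        strips.append((y, x, z, 1, key))
    return strips
```

For example, three adjacent faces with the same key collapse into one strip of length 3, while a gap in Z or a different key starts a new strip.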

So, my question: To perform merging of faces as described (ignoring whether it's a bad idea for dynamic terrain/lighting), is there perhaps an algorithm that is simpler to implement? I would also quite happily accept an answer that walks me through the greedy algorithm in a much simpler way (a link or explanation).

I don't mind a slight performance decrease if it's easier to implement, or even if it's only a little better than just doing strips. I worry that most algorithms focus on triangles rather than quads, and since I'm using a 2D atlas the way I am, I don't know that I could implement something triangle-based with my current skills.

PS: I already frustum cull per chunk and as described, I also cull faces between solid blocks. I don't occlusion cull yet and may never.

Edit: I have implemented my own little technique, which probably has a name. I simply go through my 6 lists, which are sorted by the axes they rest on, followed by block type, and then by lighting level. I iterate through them, creating new rectangles as I go and growing them simultaneously (with a heavy bias towards a certain axis). It is definitely not optimal, but it is quite fast and it lowers my vertex count by close to 50% on average. Byte56's comment is the closest I think to a real answer, but I can't select that for the answer/bounty.

Here is the fast and simplistic way I'm handling this problem AFTER generating the full initial terrain, without much optimization. Assume all squares given share the same image, light level, normal, etc.; each color is a different quad I would render. With my lists sorted the way they are, this is a very easy/fast approach.
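For readers who want something concrete, here is a minimal sketch of this kind of axis-biased rectangle growing on a single 2D slice of faces. It is an illustration of the general idea, not necessarily the exact technique used here: the `mask` layout and the `merge_rectangles` name are assumptions, and each cell holds either `None` (no face) or a key such as a `(block_type, light)` pair.

```python
def merge_rectangles(mask):
    """Cover every non-None cell of a 2D mask with rectangles.

    mask: list of equal-length rows; a cell is None (no face) or a key.
    Returns a list of (row, col, height, width, key) quads.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    used = [[False] * cols for _ in range(rows)]
    quads = []
    for r in range(rows):
        for c in range(cols):
            key = mask[r][c]
            if key is None or used[r][c]:
                continue
            # Grow along the row first (this is the biased axis).
            width = 1
            while (c + width < cols and not used[r][c + width]
                   and mask[r][c + width] == key):
                width += 1
            # Then grow downwards while the whole next row still matches.
            height = 1
            while r + height < rows and all(
                    not used[r + height][c + i]
                    and mask[r + height][c + i] == key
                    for i in range(width)):
                height += 1
            # Mark the covered cells so they are not emitted again.
            for rr in range(r, r + height):
                for cc in range(c, c + width):
                    used[rr][cc] = True
            quads.append((r, c, height, width, key))
    return quads
```

The result is not guaranteed to be the minimum number of rectangles, but each cell is visited a small number of times and every face ends up in exactly one quad.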

After all your optimizations, you're still running into performance issues? Have you tried profiling the code to see where the issues are?
– Byte56♦ Jul 3 '13 at 14:38

@Byte56 Oh, it's definitely not a performance issue at all currently. It may be one in the future when I attempt to put what I'm learning to use for a real game. For now though, I'm simply wanting to learn. I didn't state any of that because I seem to get less help if I'm just wanting to learn things simply to learn them.
– Tim Winter Jul 3 '13 at 14:51

I feel like I should also say that if I do get around to actually attempting a proper game with this, I will want the visible world to be HUGE. I would not want the character to be 2 blocks high and 1 block wide like Minecraft and many others, but more like 3 blocks wide/long and 6-9 high. Probably static terrain though.
– Tim Winter Jul 3 '13 at 14:56

Ah, OK Tim. Thanks for the explanation. I did some research into this when I was making the engine for my game. Since my landscape is dynamic, I didn't think it would be worth the performance cost every time something was changed. However, you may find some use from this article about the maximal rectangle problem. Essentially it's the algorithm you're talking about for merging similar faces (maybe not greedy, but optimal). Good luck!
– Byte56♦ Jul 3 '13 at 15:03

@Byte56 Thanks, I greatly appreciate it. That certainly looks closer to the kind of algorithm I'm looking for, if not spot on. I'll have to tinker with it a bit to see if I can get all maximum rectangles with a single run of it. I may throw a bounty up for this as well in a bit as I think both the question and a detailed answer could help many others in the future.
– Tim Winter Jul 5 '13 at 13:50

2 Answers

You only want to do strips, or at the very best, rectangles. There are no other shapes that will be solvable or useful to you.

I would also like to point out that by keeping 6 lists per chunk, at any point you're going to be using 5 lists from the chunks in cardinal directions, and at least 3 in every other one. You're wasting time here: not only can the video card do this optimisation for you, but even if you did it yourself, you would be better off keeping all the faces in one place and comparing each normal to the direction to the camera (look up the dot product of vectors).
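To make the dot product comparison concrete, here is a small sketch that decides, per chunk, which of the six face directions could possibly face the camera. The names (`NORMALS`, `lists_to_draw`) and the axis-aligned chunk bounds are assumptions for illustration only. For each normal it takes the chunk corner lying furthest back along that normal and checks whether the camera is in front of it.

```python
# One unit normal per face direction; the names are illustrative.
NORMALS = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}

def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(p * q for p, q in zip(a, b))

def lists_to_draw(camera, chunk_min, chunk_max):
    """Return the names of the face directions that may face the camera."""
    visible = []
    for name, normal in NORMALS.items():
        # Corner of the chunk furthest back along this normal. If the
        # camera is not in front of it, no face with this normal in the
        # chunk can point towards the camera.
        corner = tuple(lo if n > 0 else hi
                       for n, lo, hi in zip(normal, chunk_min, chunk_max))
        to_camera = tuple(c - k for c, k in zip(camera, corner))
        if dot(normal, to_camera) > 0:
            visible.append(name)
    return visible
```

With a 16x256x16 chunk at the origin, a camera inside the chunk gets all six directions, a camera directly east of it at the same height gets five, and a camera off to a diagonal and above it gets three, which matches the counts quoted above.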

CaptainRedMuff's links on greedy meshing, which you've already seen, are THE answer to your problem. It should also be apparent from them that you will end up with "stripy" meshes, but they are perfectly effective. Coming up with something more optimal will be more complicated, and you have said that the greedy mesh is already too complicated.

If you're having performance issues with a single chunk, it makes me think something else is greatly wrong.

My first guess would be that you are calling a draw operation way too often. You can only tell the graphics card to draw between roughly 50 and 400 times per second, depending on hardware. How many calls to either .setData or .draw____Primitives are you making?

Rectangles are what I want. Byte56 linked an algorithm much simpler to follow than greedy, but I can't accept a comment as an answer. I assumed he commented instead because he only linked to an algorithm rather than explaining it. I don't know what you mean regarding my lists. I typically have 1-2 draw calls per chunk. My lists are stored in one buffer per chunk, and I change which ranges to draw each frame. I could be wrong, but sending unnecessary data to the GPU seems counterproductive. An extra draw or two per chunk seems better than sending more data to the GPU and forcing it to also cull.
– Tim Winter Jul 6 '13 at 11:31

I am not having performance issues. I simply enjoy learning these things. I will edit my question to say this, but like I also told Byte56, I get less help if what I'm trying to do isn't actually solving a problem. To answer your question, I render 200-260 chunks at most per frame currently. That's a pretty common 500ish DrawIndexedPrimitives per frame. I regularly get 100-130 FPS without vsync, making that 65,000 calls per second. I'm not trying to be rude, but little you said makes sense to me. I may not understand, so please elaborate if I misunderstood. Maybe it has to do with XNA?
– Tim Winter Jul 6 '13 at 11:47

Ah, sorry, I meant to say per frame, not per second. You're getting close to an upper limit there, and if you're only on terrain, you've pretty much run out.
– Phil Jul 6 '13 at 23:55

I +1'd your answer as it did make me think of a few improvements I could make, though they have nothing to do with the question. Also, I thought I should share an answer from Nathan Reed on the possible draw call limit.
– Tim Winter Jul 10 '13 at 18:37

I'll have to find the article I read that quoted 400-500 as a limit. A quick Google search didn't help me.
– Phil Jul 19 '13 at 11:56

The article that you link to doesn't actually give an algorithm for generating the mesh; it states a way to evaluate possible solutions.

Generally, though, the article gives a good summary: 1) the naive solution is bad, 2) there's an optimal solution, but it will be both time-consuming to write and to run, 3) a quick culling gives much better results (and this is all Minecraft itself does), and finally 4) there might be a smarter solution (the greedy algorithm).

The "solution" that you imply in the image in your question is the greedy algorithm. It seems your question is "did I implement his solution correctly?" to which I must answer "he didn't give a solution; you just solved this yourself."

Can your solution be better? Meh. Sure. Probably not worth it. I think occlusion culling or LOD would be better next steps. Minecraft wastes a good number of cycles drawing the surface even when you're underground, as well as drawing vast networks of cave and mine systems when you're on the surface. Once you've got a good mesh breakdown, yeah, getting rid of obscured surfaces can be a good thing, but it's generally faster to let your video card do backface culling, for example, than to do it on the CPU.