In a recent thread I came up with a method of drawing tiles without needing a quad for each tile, avoiding a vertex bottleneck by doing the work in a fragment shader. I've now implemented it.

The whole map in this test program is 2048x2048 tiles, and they are all drawn as a single quad (no culling on the CPU). Zooming out so that all tiles are visible, FPS drops to around 450-500 as the texture cache won't be able to do its magic. However, in that case tiles are smaller than screen pixels, and the whole map just looks like randomly colored pixels.

The renderer is not limited to pure 2D rendering in any way. It's possible to just rotate and scale the rendered quad to achieve lots of effects. For example, isometric tiles:

It uses an RGB texture to store one tile index per map cell, and a 2D texture array for the tile images to prevent bleeding between tiles when linear filtering is used. The most important part is the fragment shader:

which calculates the tile index from 2 color channels and then looks up the tile in the texture array. This could be implemented in OpenGL 2 too by using a plain 2D tile sheet, but that would have severe bleeding between bordering tiles when filtering is used. However, mipmaps do not work, as they produce weird 2-pixel seams between tiles for me.
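For reference, a minimal sketch of a fragment shader along these lines (the uniform names and in/out layout are my own assumptions, not the original source, which is in the download below):

```glsl
#version 130
#extension GL_EXT_texture_array : require

uniform sampler2D mapTexture;        // per-cell tile indices in two 8-bit channels
uniform sampler2DArray tileTexture;  // one array layer per tile image
uniform float mapSize;               // map dimension in tiles, e.g. 2048

in vec2 texCoord;                    // 0..1 across the whole map quad
out vec4 fragColor;

void main() {
    // Reconstruct the tile index from two normalized color channels.
    vec2 tileResult = texture(mapTexture, texCoord).rg;
    float tile = dot(tileResult, vec2(65536.0, 256.0));
    // Local texture coordinate within the current tile.
    vec2 tileCoord = fract(texCoord * mapSize);
    fragColor = texture(tileTexture, vec3(tileCoord, tile));
}
```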

Here is the Java and GLSL shader source, plus a test tileset: http://www.mediafire.com/?flc36t8uq6floaw You need the LWJGL jars on the classpath and the natives directory set in the VM arguments. It also needs OpenGL 3 to run, but it is possible to work around this.

libtcod's shader renderer uses this trick too, though your code is a lot easier to follow. The libtcod code uses some nasty tricks to deal with correcting for NPOT map dimensions, whereas I think just starting with a POT texture in the first place is a better idea (I guess it might get a bit more expensive if you had a 257x257 map, but eh, edge case).

For a true isometric view you really need a custom tileset designed for it, but it's still nice to be able to pull off arbitrary scaling.

I'm sure you can modify the texture loader and fragment shader to sample from a special isometric tile texture. =D

EDIT: Why would NPOT map dimensions be a problem? Non-power-of-two textures have been supported for a long time now.

NPOT textures are indeed supported, but operations on them can be considerably slower. Maybe the libtcod guys were coding for nvidia 5xxx cards, which claimed to implement GL2.0 but didn't support NPOT textures, I dunno. At any rate, it's nothing anyone needs to worry about when already requiring GL 3.0.

The only thing that actually requires GL3.0 is texture arrays. I was thinking of using them to prevent bleeding between tiles, especially when using mipmaps, but for some reason I get weird seams between tiles with a seemingly random color from the tile I'm drawing, and it only happens when I enable mipmaps, regardless of whether I use GL_LINEAR or GL_NEAREST. That means you could get the exact same result with a single huge 2D tileset texture instead of a 2D texture array, as long as you eliminate bleeding by using GL_NEAREST or adding a border around each tile. The real problem is generating texture coordinates from the tile index in the fragment shader, and I suspect performance could suffer a little because of it. With a texture array, I can just pass the tile index as an integer to the sampler and get the correct layer, which lets me determine the tile index and fetch the tile texel in only 3 lines.
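The atlas-based alternative would look roughly like this, sketched in OpenGL 2 style GLSL (the uniform names and atlas layout are my own assumptions):

```glsl
// Alternative without texture arrays: compute atlas coordinates from the index.
uniform sampler2D mapTexture;   // per-cell tile indices
uniform sampler2D tileAtlas;    // all tiles packed into one big 2D texture
uniform float mapSize;          // map dimension in tiles
uniform float tilesPerRow;      // tiles per row in the atlas

void main() {
    vec2 tileResult = texture2D(mapTexture, gl_TexCoord[0].st).rg;
    float tile = dot(tileResult, vec2(65536.0, 256.0));
    // Convert the linear index into a (column, row) cell in the atlas...
    vec2 cell = vec2(mod(tile, tilesPerRow), floor(tile / tilesPerRow));
    // ...then offset by the local coordinate within the current map tile.
    vec2 local = fract(gl_TexCoord[0].st * mapSize);
    gl_FragColor = texture2D(tileAtlas, (cell + local) / tilesPerRow);
}
```

With GL_LINEAR or mipmaps enabled, the samples near the cell borders bleed into neighboring atlas tiles, which is exactly the problem described above.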

Anyway, it's nice to see people interested in this. =D

EDIT: Facepalm! There's a small "bug" in the shader. It's supposed to be float tile = dot(tileResult, vec2(65536, 256.0)), not 65535!!! It'll probably round to the right number as long as the index is under 32 000 though...

Holes should be easily doable with a designated tile id meaning "void tile" where the fragment shader calls discard. The "blending" for an alpha of 0 is pretty trivial though, and I wouldn't be surprised if it was just as fast.
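Sketched in GLSL (VOID_TILE is an illustrative sentinel value, not from the original code):

```glsl
const float VOID_TILE = 65535.0; // assumed sentinel index meaning "no tile here"

// ...inside main(), after reconstructing the tile index:
float tile = dot(tileResult, vec2(65536.0, 256.0));
if (tile == VOID_TILE) {
    discard; // punch a hole: this fragment writes nothing at all
}
```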

Time to resurrect this thread once again! I've fixed the seam problem once and for all!

The cause (skip this if you don't care)

There were actually two different problems causing this. The implementation above had a very minor problem with "seams" when scaling veeeery slowly, but it was very difficult to notice and only visible with subpixel scaling or translation. The other problem was triggered by using mipmaps: the generated texture coordinates wreaked havoc on OpenGL's built-in LOD selection (which mip level to use).

The universal seam problem

This was caused by me expecting floating point math to make sense. Rounding problems suck. There was a veeeeeery small chance that with extreme edge cases the texture filtering on the tile index lookup texture returned the index for one tile while the local texture coordinate generation calculated texture coordinates for a different tile.

Notice how the white at the top of the tile also appears at the bottom seam of the tile. The tile index was gotten from the center tile, but the local texture coordinates were calculated for the tile below.

I solved this by simply storing the X and Y coordinates of each tile in the tile index texture too. That way the local texture coordinates will always match the tile index that was fetched. The tile index texture is now a GL_RGB16 texture: tile indices are stored in the red channel, and map X and Y are stored in the green and blue channels.
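A sketch of how the CPU side might pack that data (the buffer layout and names are my own assumptions; the real program would upload this buffer as a GL_RGB16 texture):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

public class TileMapData {
    // Packs (tileIndex, x, y) per tile for a GL_RGB16 tile index texture:
    // red = tile index, green = map x, blue = map y.
    public static ShortBuffer buildIndexTexture(int[][] tiles, int width, int height) {
        // 3 channels of 16 bits (2 bytes) per tile.
        ShortBuffer data = ByteBuffer.allocateDirect(width * height * 3 * 2)
                .order(ByteOrder.nativeOrder()).asShortBuffer();
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                data.put((short) tiles[x][y]); // red:   tile index
                data.put((short) x);           // green: tile's map x
                data.put((short) y);           // blue:  tile's map y
            }
        }
        data.flip();
        return data;
    }
}
```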

The mipmap seam problem

This was a lot harder to track down, but after working on a per-pixel distortion shader which also used dependent texture reads I realized the problem. It's due to how OpenGL calculates which mip level to sample from. Basically OpenGL picks the mip level by checking how the texture coordinates change over a 2x2 pixel area. This lets it determine how quickly the texture coordinates change, and it can then pick a mip level depending on the texture size. It also allows anisotropic filtering to work. However, it's possible to confuse OpenGL into picking the wrong LOD value, and that is exactly what was happening with my generated texture coordinates.

These are the local texture coordinates of each tile. The problem is at the edges, because the texture coordinates aren't continuous there. Since the values are checked over a 2x2 pixel area, the rate of change might be calculated across 2 or even 4 different tiles, each having vastly different values (one close to 1, one close to 0). The result is that the shader samples from a very small mip level at the edges, usually the smallest one. I solved this by calculating the LOD value on the CPU (very easy), sending it as a uniform to the shader, and sampling from the texture with texture2DArrayLod() using the precalculated LOD value.

The important Java code changes include:

- The tile index texture is now a GL_RGB16 texture which contains (tileIndex, x, y) per tile. The level generation and single-tile updating code has been updated.
- Mipmaps are generated and enabled (the code was already there, just commented out).
- Texture LOD is calculated and passed on to the tile renderer from Java. A GLSL uniform for this is updated each frame. LOD is calculated with the following code:
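Something along these lines (a hedged reconstruction, not the original code; the exact version is in the download, and the method name here is my own):

```java
public class TileLod {
    /**
     * CPU-side mip LOD selection. currentScale is the on-screen size of one
     * tile in pixels (the zoom level), tileSize is the tile texture size.
     */
    public static float computeLod(float tileSize, float currentScale, int mipLevels) {
        // How many texels of the tile texture map onto one screen pixel.
        float texelsPerPixel = tileSize / Math.max(currentScale, 0.0001f);
        // log2 of that ratio picks the mip level; clamp to the valid mip range.
        float lod = (float) (Math.log(texelsPerPixel) / Math.log(2.0));
        return Math.max(0.0f, Math.min(lod, mipLevels - 1));
    }
}
```

The result is passed to the shader as a uniform and used with texture2DArrayLod() instead of relying on OpenGL's automatic LOD selection.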

Mostly unchanged. 0.5 to 1.0 milliseconds for a fullscreen quad on mid-range hardware (1000-2000 FPS). The highest seen was just under 3 milliseconds (370 FPS) for extremely zoomed out views (over 1 million tiles visible). Enabling mipmaps slightly improves performance for zoomed out views since smaller textures are used.

EDIT: I enabled SLI on my GTX 295 for the test program and ran it at 1920x1080 in fullscreen. On the default zoom level I got 3000 FPS and a really scary whistling sound from my graphics card... High FPS = scary. o_O

Compatibility

I looked up the texture array extension, and it's supported by OGL2 level AMD cards, but not NVidia cards. In other words, this program requires a DX9 AMD card or a DX10 Nvidia card = an AMD HD2000+ series card or an Nvidia 8000+ series card. It's possible to ditch the texture array, but it requires some pretty big changes in the shader to pick out tiles directly from a normal 2D texture and it breaks mipmap and bilinear interpolation support since you'll get bleeding between tiles. However, that would decrease the requirement to any card supporting shaders.

Sigh. Every time someone mentions this article I end up adding something new to it. I think this'll be the last new feature I add though. Basically I added bilinear filtering between tiles. Check out these comparisons:

Note that there are two different comparisons (the little tab on the top left). The first one is 100% sharp bilinear interpolation, and looks like quad rendered tiles with 256x antialiasing. The second version imitates the filtering done by the GPU to make the whole tile world look like one continuous filtered image without any discontinuities between tiles, something that is impossible to achieve when drawing tiles with quads.

The new shader basically checks the closest 4 tiles instead of just one and does bilinear filtering between the 4 samples if the pixel is on an edge between them. This is all done in the shader, of course. The only thing you need to add on the CPU side is a shader uniform ("scale") set to how big a tile is onscreen in pixels. This was already available in the test program above, since it's basically the current zoom level (currentScale), so I just passed that in. The value should be clamped so it isn't below 1, since when tiles cover less than a pixel it's just going to shimmer anyway. To get the super smooth version without any discontinuities, clamp the value to a maximum of <tile size>. This all works with isometric tiles too. The result is perfect antialiasing and no shimmering when the camera moves or zooms. The difference is a LOT more noticeable in motion.
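In rough GLSL, the idea might look like this (my own reconstruction of the technique described above, not the original shader; sampleTile() stands in for the existing index lookup plus texture array fetch):

```glsl
uniform float scale;   // on-screen tile size in pixels, clamped to >= 1
uniform float mapSize; // map dimension in tiles

vec4 sampleTile(vec2 mapCoord); // assumed helper: tile texel at this map position

void main() {
    vec2 mapCoord = gl_TexCoord[0].st * mapSize;
    // Only blend within one on-screen pixel of a tile border; in map units,
    // one screen pixel is 1/scale of a tile wide.
    float border = 1.0 / scale;
    // Weights are 0 except inside the blend region at the right/bottom edges.
    vec2 w = clamp((fract(mapCoord) - (1.0 - border)) / border, 0.0, 1.0);
    // Sample this tile and its right, bottom and diagonal neighbours.
    vec4 c00 = sampleTile(mapCoord);
    vec4 c10 = sampleTile(mapCoord + vec2(1.0, 0.0));
    vec4 c01 = sampleTile(mapCoord + vec2(0.0, 1.0));
    vec4 c11 = sampleTile(mapCoord + vec2(1.0, 1.0));
    // Standard bilinear mix with the edge-limited weights.
    gl_FragColor = mix(mix(c00, c10, w.x), mix(c01, c11, w.x), w.y);
}
```

Clamping scale to a maximum of the tile size widens the blend region to a full texel, which gives the fully continuous "one big filtered image" look from the second comparison.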

Basically we went from 0.5-1.0 ms to 1.0-3.5 ms. This kind of performance is still very usable, especially since 2D games are almost always CPU limited, so it shouldn't even have an effect on FPS at all in a real game!
