2011-09-29

I've been doing some Javascript recently (well, Coffeescript actually, but this applies equally to either, so I'll stick to Javascript for the examples) and found its approach to objects rather interesting. I think it might be instructive to compare Javascript's objects to Python's, the dynamic programming language I'm most familiar with.

First of all, take this Javascript code, which creates a factory-function called "Counter" which returns objects with "increment" and "decrement" methods:

In our outermost scope (the identity of which I'll gloss over for now) we have three names (or keys) defined. "Counter" is a function object, while "c1" and "c2" are both generic objects. "Counter", like all functions, has an object attached to it under the key "prototype", and it's in this prototype object that we've inserted the functions increment and decrement.

Each time we invoke "new Counter()" a new object is created with its special "__proto__" key mapped to Counter.prototype, and then the Counter function is executed with "this" bound to the new object. The Counter function adds to that object, mapping "value" to the number 0. (I must say, I still don't entirely understand why "new" works like this - I can't see any value in being able to invoke the same function both with and without "new".)

Finally we invoke "c2.increment()". First Javascript looks on the c2 object for the key "increment", but there is no such key. Next, it follows the "__proto__" key and tries again with the prototype object. There it finds the increment function. The function is executed with "this" bound to the "c2" object. When it does "this.value += 1" it first looks up "value" in the same way, finding it directly on the object itself, then after adding 1 to the value it stores it back on the object. (Note that storing something doesn't involve "__proto__" traversal. It always goes straight into the object that you put it into. It's only look-ups that involve "__proto__" traversal.)
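Since Python comes next, here's a toy model of that look-up rule, with dicts standing in for Javascript objects (my own sketch, not how either runtime really works):

```python
def js_get(obj, key):
    # Walk the __proto__ chain, following the look-up rule described above.
    while obj is not None:
        if key in obj:
            return obj[key]
        obj = obj.get('__proto__')
    return None  # Javascript would give you `undefined`

prototype = {'increment_doc': 'adds one'}
c2 = {'__proto__': prototype, 'value': 0}

assert js_get(c2, 'value') == 0                   # found directly on the object
assert js_get(c2, 'increment_doc') == 'adds one'  # found one hop up the chain
assert js_get(c2, 'missing') is None              # fell off the end of the chain

c2['increment_doc'] = 'shadowed'    # stores never traverse __proto__ ...
assert prototype['increment_doc'] == 'adds one'   # ... so the prototype is untouched
```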

At first glance, things look pretty similar. The "__class__" attributes on the objects are much like the special "__proto__" attributes in Javascript. The "__init__" method is quite similar to the "Counter" function in Javascript. The differences are subtle.

When Python executes the class statement, the three functions "__init__", "increment" and "decrement" are defined and attached to the newly created class object.

When "Counter()" is invoked a bunch of stuff goes on behind the scenes, but importantly a new object instance is created with "__class__" pointing to the class and then the "__init__" function is invoked with the object instance as its first argument.

When "c2.increment()" is executed, first "increment" is looked up on c2. This fails, so next Python follows "__class__" and checks for "increment" on the class. However, something special happens when it finds it there. When an attribute lookup on an object finds a function on its class, it instead returns a "bound method": effectively a function that takes one fewer argument and passes through to the "increment" function itself, inserting a reference to the object as the first argument. So even though the statement "c2.increment()" provides no arguments, when "increment" is actually executed, it receives c2 as its first argument.

There are, I think, two interesting differences here. One is what happens when we want to represent inheritance, which my example doesn't really cover. In Javascript, if attribute lookup fails on the prototype then the prototype's prototype is searched, and so on. In Python, if the attribute lookup fails on the class, then instead of looking at its "__class__" attribute, the base classes are searched. (And since Python allows multiple inheritance, the corner cases are a bit tangled.)

But what I think is most interesting are the different approaches to how methods know their receiving object, and the unintuitive consequences of these mechanisms. In Javascript, when a function is retrieved from an object and invoked in the same expression, it triggers the binding of "this" to the object in the function. But if you take a reference to the function and invoke it separately, then "this" is no longer bound to the object - it's "undefined" in strict mode, or the global object otherwise. For example:

c1.increment();

is not equivalent to this:

var f = c1.increment;
f();

The former will increment c1. The latter will not: in strict mode increment executes with "this" undefined (so it throws), while in non-strict mode "this" is the global object, so it quietly increments a global "value" instead of c1's.

In Python, on the other hand, the equivalent will work as you might expect, because "c1.increment" in fact refers to a bound method. However, from this we can see that in Python "Counter.increment" is not the same entity as "c1.increment". "Counter.increment" is a function that takes one argument. "c1.increment" is a bound method that takes no arguments. In contrast, in Javascript, "Counter.prototype.increment" refers to the exact same object as "c1.increment". So Python does the magic during object attribute lookup, while Javascript does the magic when invoking an attribute as a function.
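A minimal reconstruction of the Python Counter described above makes the bound-method distinction concrete (my own sketch of the class the post is discussing):

```python
class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1

    def decrement(self):
        self.value -= 1

c1 = Counter()

f = c1.increment        # a bound method: c1 travels with it
f()
assert c1.value == 1    # incrementing works with no receiver at the call site

g = Counter.increment   # a plain function of one argument
g(c1)
assert c1.value == 2
assert f.__func__ is g  # the bound method wraps the very same function object
```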

I think this is all correct, but do let me know if I've made a mistake. I'm finding Javascript a lot more pleasant than I expected, except for its cheerful silent propagation of "undefined" at every opportunity, and Coffeescript definitely rounds off some of the rough corners. In fact, I'd say I even slightly prefer Coffeescript's class notation to Python's.

I'm curious to know if there are any particular benefits of prototype-based objects. I can't see any clear advantages over class-based objects, and I particularly don't understand the benefit of being able to use any function to create an object. Is it just nice because it results in a smaller language? (Does it result in a smaller language?)

Okay, you're back? I'll try not to state the answer outright, but it will probably become pretty obvious very quickly. Right, so this got me thinking about sums of series. The case in the question is sufficiently small that you don't even need to work out the formula for the sum of the sequence in question, but I was thinking about it nonetheless.

Now, I knew the formula for the sum of an arithmetic sequence, and I knew that you could easily enough prove it by induction, but this is a very unsatisfying form of proof. It doesn't feel like something you could have figured out on your own. Imagine just picking random equations one at a time and testing them until you find one that you can prove correct with induction! I thought about it a little and came up with what I thought was a pretty compelling geometric proof, at least for the simple case where you start at 1 and increment by 1. (I have no idea if a proper mathematician would call this a proof. I am certainly not a proper mathematician.)

Is it obvious? On the left in the green we have a square with area 1, a rectangle with area 2, a rectangle with area 3, etc. The area of the whole shape must be the sum of those areas. By dividing up the space differently, we can easily see how to produce the closed form.
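For the record, the closed form the diagram leads to is n(n+1)/2; a quick brute-force check against the direct sum:

```python
# The diagram's closed form: 1 + 2 + ... + n == n * (n + 1) / 2.
def triangle(n):
    return n * (n + 1) // 2

for n in range(1, 200):
    assert sum(range(1, n + 1)) == triangle(n)
```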

Great, I thought. Now let's do the same for the sum of squares. It must be just as obvious in 3D. Right?

Hmmm. Well, it gives the right answer, and it's satisfying to figure it out by hand, but it lacks the obviousness of the diagram for the arithmetic sequence sum. Perhaps there's a simpler way to do it?
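The closed form that the 3D dissection yields is n(n+1)(2n+1)/6, which checks out numerically:

```python
# Sum of squares: 1^2 + 2^2 + ... + n^2 == n * (n + 1) * (2n + 1) / 6.
def square_pyramid(n):
    return n * (n + 1) * (2 * n + 1) // 6

for n in range(1, 200):
    assert sum(k * k for k in range(1, n + 1)) == square_pyramid(n)
```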

As an aside, I think I'm getting the hang of Inkscape, although there are still some areas where I much prefer Visio. Inkscape has some pretty painful and inflexible handling of arrowheads, for example. They're always black, regardless of the line colour, and there's a special menu action just to make them match the line colour. The selection of arrowheads available is limited and weird, and most of them aren't resizable. I'm not sure whether that's because SVG forces arrowheads to behave in strange and unintuitive ways, or whether it's just the way Inkscape chooses to work.

2011-07-12

Last time, I discussed ambient occlusion for adding depth to the map. Today I'm going to explain how the fragment shader rotates and flips oriented blocks like minecart tracks.

As you may have noticed, there's only one block ID for regular minecart tracks. The 4-bit data value encodes the orientation of the track, from 10 possibilities - 2 orientations of straight track, 4 orientations of corner track, and 4 orientations of sloped track.

A minecraft texture pack includes only two textures for normal minecart tracks: one curved and one straight. We need to use these to make up all the other orientations. We could either do that in advance - building up a bigger texture atlas with all the orientations we need, or we can do it in the shader. By doing it in the shader we avoid the need for a much expanded texture atlas and some fiddly work to assemble textures, at the cost of extra work in the shader.

To let the shader know what orientation to use, we encode it in the blue channel of our map data texture. We use 2 bits to encode a rotation (0°, 90°, 180°, 270°) and 1 bit to encode a possible flip. (I don't think we actually use the flip at the moment, but it might be useful for doors.)
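The packing might look something like this (the exact bit layout here is my assumption, not taken from the real code):

```python
def encode_orientation(quarter_turns, flip):
    # Two bits of rotation (0-3 quarter turns) plus one flip bit,
    # destined for the blue channel of the map data texture.
    # Bit layout is hypothetical.
    return (quarter_turns & 0b11) | (int(flip) << 2)

def decode_orientation(blue):
    # Inverse of the packing above.
    return blue & 0b11, bool(blue & 0b100)

assert decode_orientation(encode_orientation(3, True)) == (3, True)
assert decode_orientation(encode_orientation(1, False)) == (1, False)
```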

The shader code isn't actually terribly exciting. I should however point out that this version treats the map texture as having normalized float values, rather than integers. For most of the earlier posts I had been updating the code to use integer textures to simplify things a bit, but I haven't gotten around to doing that with this code. That's why we multiply the sample values by 255.0 – as normalized floats, the 0–255 byte values have already been rescaled to the range 0.0–1.0.

We form a rotation matrix with the rotation value of 0, 0.5π, π or 1.5π. The sines and cosines just work out as -1, 0 or +1. There's probably a cheaper way to calculate them.

This will rotate around the origin. Before we transform our texture coordinate we rescale it to the range -1..1, then adjust it back afterwards.
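In Python rather than GLSL, the transformation looks something like this (a sketch of the idea only; the real shader builds a mat2 and works on vec2s):

```python
import math

def rotate_texcoord(u, v, quarter_turns):
    """Rotate a texture coordinate in [0, 1]^2 about the tile centre,
    mirroring what the fragment shader does (a sketch, not the GLSL)."""
    angle = quarter_turns * 0.5 * math.pi
    # The sines and cosines work out as exactly -1, 0 or +1.
    c, s = round(math.cos(angle)), round(math.sin(angle))
    # Rescale to -1..1, rotate about the origin, rescale back to 0..1.
    x, y = 2.0 * u - 1.0, 2.0 * v - 1.0
    xr, yr = c * x - s * y, s * x + c * y
    return (xr + 1.0) / 2.0, (yr + 1.0) / 2.0

assert rotate_texcoord(1.0, 0.5, 1) == (0.5, 1.0)  # a quarter turn
assert rotate_texcoord(0.5, 0.5, 2) == (0.5, 0.5)  # the centre is fixed
```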

Some mention should be given to how we obtain the rotation values in the first place. There's no trick – there's just a great big table mapping block id and data values to texture coordinates and orientations. The same table is used to pick coloured wool values.

And here's the end result. I think I've talked about most of the things I had originally set out to cover. Next week I might touch on light levels, but there's not a lot to say there. If I find the time I'd like to do some work on cave rendering, but I'm not sure I'll both find the time to work on it and write a blog post, so next week's post may be quite short.

2011-07-05

Last time I described in principle how the ambient occlusion shader works, but didn't really go into the details. Today I'll go into a bit more detail.

Something that I think may not have been clear last time is that we're calculating ambient occlusion independently from lighting based on the "blocklight" and "skylight" values. For ambient occlusion we're considering only the arrangement of solid blocks in the immediate neighbourhood, not any genuine sources of light. As a separate step we later multiply together the ambient occlusion value and the light-level based on blocklight and skylight. Again, this is a simplification, and it does result in extra shadows in dark areas, but it generally looks fine.

As we saw previously, to determine the level of shadow added by occlusion, we determine how much "sky" is visible from the rendered surface, where the sky is actually a horizontal square not very far above the surface. This screenshot shows the view from the ground looking up. The white blocks represent the sky square, and the stone slabs are neighbouring taller cells.

The yellow outline shows the unoccluded area of sky. The following diagram breaks that down into the areas shadowed by each of the eight neighbouring columns:

Area A is the occlusion from a neighbouring corner block, while areas B and C are occlusions from neighbouring side blocks. You'll observe that (1) the corner blocks always occlude rectangular regions of sky; (2) the side blocks occlude trapezoidal areas; and (3) the corner block occlusions can overlap with those of the neighbouring side blocks.

Now, I have to admit that I made a bit of a mistake when I first worked this out, and didn't realise that the side blocks were occluding trapezoids, so I worked it as if they were rectangles instead, like this diagram:

The problem with this simplification is that it results in significant discontinuities around corners. Encountering these, but not understanding why, I experimented with tweaking values, and if you look in the shader code you'll see a comment, "TODO: Figure out why the 2* multiplier is needed in the next line!" I haven't actually figured out why that works yet, but I plan to work through the maths again with trapezoids, if it's not too complicated, and see how it compares. However, for now I'm going to discuss the slightly broken version with the rectangles.

Here's another pair of diagrams of those occlusion rectangles:

On the left we have a similar case with some extra occluding columns. Things to note here are that the yellow corner column provides no extra occlusion because it is shorter than both of the neighbouring side columns. The dotted line shows the division of the sky square into quadrants, each of which we'll handle separately.

On the right diagram, you can see the same arrangement, but different rectangles are highlighted. The orange "+" rectangle is the area of unoccluded sky in the quadrant ignoring the corner column. This can be easily calculated based on the proximity of the neighbouring cells on the sides, and the heights of the columns in those cells. The teal "−" rectangle is the extra occlusion provided by the corner column, which we subtract from the unoccluded area to find the total unoccluded area in the quadrant. We calculate the teal area similarly, using the proximity of the neighbouring side cells, and the height of the corner column, but we make sure to clamp its dimensions to be positive – if either dimension would be negative it means the corner block is not going to provide any occlusion and can be ignored.

I've left this a bit late, so I don't have formatted code snippets or a nice runnable example, but you can go look at the complete fragment shader in github. The obscurely named "lightcalc" calculates the dimension of one side of the orange rectangle based on the height and proximity of the neighbouring cell, with 1 being the maximum unoccluded size. "cornerlight" calculates the unoccluded area in the quadrant, again with 1 being entirely unoccluded.
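In spirit, the per-quadrant calculation looks like the sketch below. The variable names and the stand-in formula inside lightcalc are mine, not the shader's; only the plus/minus rectangle arithmetic follows the description above.

```python
def lightcalc(height_diff, proximity):
    """One side of the orange rectangle: how much of this quadrant's sky
    edge is left unoccluded by a neighbouring side column. 1.0 means
    fully unoccluded. The formula is a hypothetical stand-in."""
    if height_diff <= 0:
        return 1.0  # a column no taller than us casts no shadow
    return max(0.0, min(1.0, 3.0 * proximity / height_diff))

def cornerlight(side_a, side_b, corner_a, corner_b):
    """Unoccluded area in one quadrant: the orange '+' rectangle minus
    the teal '-' rectangle, with the teal dimensions clamped so that a
    negative dimension means the corner adds no occlusion at all."""
    plus = side_a * side_b
    minus = max(0.0, side_a - corner_a) * max(0.0, side_b - corner_b)
    return plus - minus

assert cornerlight(1.0, 1.0, 1.0, 1.0) == 1.0   # nothing occluded
assert cornerlight(1.0, 1.0, 0.5, 0.5) == 0.75  # corner bites a quarter out
```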

This shader is pretty slow. It results in a noticeable frame-rate drop for me. I haven't yet investigated whether the cost is in the quantity of calculations or from sampling all eight neighbouring cells for their height, but I suspect it's the latter. If that's the case, we could preprocess the image and store information about each cell's neighbours in a compressed form, reducing the number of samples needed. Maybe an expert could suggest the most fruitful avenues to optimization.

Here's what it looks like without any textures:

Any thoughts? It occurs to me that I could very well have overlooked a much easier or a less horrifically simplified way to do this. I'll need to look into how feasible it is to properly calculate the trapezoidal areas, and also figure out why the bodge of doubling the teal area results in continuous results (at least, continuous between cells of equal height; we don't want continuity across height changes – the whole point of this is to highlight height differences). Feel free to ask questions if it's not clear. I may yet do one more post on this topic if people would like further clarification.

2011-06-28

Today I'm going to talk about the approach to ambient occlusion used in the Minecraft mapper.

First of all, we assume that light falls uniformly from the sky, like on an overcast day. The amount of light illuminating a particular patch of ground is thus proportional to the amount of the sky that can be seen unobstructed from that point. The following diagrams show the general idea in 2D from the side:

The red point on top of a plateau has an unobstructed view of the sky, so it is maximally illuminated. The green point is only slightly occluded by the peak on the left. The blue point is substantially occluded on both sides, so it will be quite dark.

That's the general idea. The amount of illumination for every point will be proportional to the solid angle of the sky visible from that point. However, that's quite a lot of work to do. We're going to cheat a lot by making a number of simplifications:

We're going to ignore overhead obstructions and only consider the occlusion caused by neighbouring cells.

We're going to assume that the sky is in fact a horizontal plane 3m above the point we're rendering.

We're further going to consider only a 2m×2m square of that plane, and assume that the visible 2D area of that square is a reasonable approximation of the solid angle of sky visible from the point.

The following diagram gives an idea of what we're talking about, again looking at things side-on:

The reason these simplifications help is that we can now calculate the occlusion just from a height-field describing the heights of the cells that we're rendering. For each pixel, we consider the height of the cell it's in and the height of the neighbouring cells. For each neighbouring cell, we consider how much higher it is than the central cell, and how far the edge of that cell is horizontally from the point being rendered. Based on all that, we can calculate how much of the 2m square patch of sky is visible.
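For a single neighbouring cell along one axis, the similar-triangles geometry works out like this (my own parameter names; a sketch of the idea, not the mapper's code). A ray from the point to a sky position x clears the neighbour's top edge only if x is within sky_height * distance / height_diff:

```python
def side_occlusion(height_diff, distance, sky_height=3.0, half_extent=1.0):
    """Fraction of one half of the sky square occluded by a neighbouring
    column `height_diff` metres taller, whose edge is `distance` metres
    away horizontally. Sky plane assumed `sky_height` metres up."""
    if height_diff <= 0:
        return 0.0  # a shorter neighbour never occludes the sky plane
    # Project the column's top edge onto the sky plane: sky beyond this
    # horizontal distance is hidden behind the column.
    visible = min(half_extent, sky_height * distance / height_diff)
    return (half_extent - visible) / half_extent

assert side_occlusion(0.0, 0.5) == 0.0   # no taller neighbour, no shadow
assert side_occlusion(3.0, 1.0) == 0.0   # the edge projects exactly to the rim
assert side_occlusion(6.0, 0.5) == 0.75  # a close, tall column: mostly dark
```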

In this diagram the red point is a pixel for which we want to calculate the ambient occlusion. The pixel is very close to the tall cell to the north, which will occlude a fair bit of the sky. The medium sized block to the east is a bit further away and not as tall, so it won't occlude as much, and might not occlude any at all. The only other taller block, to the south-west, is sufficiently far away and small that it's not going to occlude any of the patch of sky we're considering.

Next time I might go over the maths, or I might set this aside for now and talk about how we flip and rotate tiles for rendering minecart tracks. Let me know if you have a preference.

2011-06-21

Previously, we put together some code to render a 2D array of tiles. But rendering a horizontal 2D slice of the 3D Minecraft world is not particularly helpful. Today we will use numpy to flatten the 3D terrain into a 2D map.

Essentially, we just want to cast a bunch of vertical rays towards the ground and find the spots where they meet the ground:

Note in particular that we want to allow the starting point to be in solid matter, and when it is, we don't want it to end until it first exits the solid matter and then hits the ground. This makes it easier to see relatively shallow caverns in amongst great depths of rock. In addition, we're going to allow specifying a height-field for the ray-starts. I think this will be useful later for mapping multi-level underground structures - I'm imagining either letting the user paint across the map with a "deeper" and a "shallower" brush, or calculating a starting height-field based on one or more seed points.

The blue ray doesn't hit any floor in the specified range. That's a cell that we'll either leave transparent or render some sort of "wall" texture into.

These two diagrams show different height-fields selected to catch different floors. Whereas the first image had some abrupt changes when the height-field of ray sources passed through floors, here we've avoided that by slightly changing the height-field.

That's the theory behind using a height-field here. In practice I haven't done much with the height-fields yet other than test that they work by making diagonal slices through the world.

This is the method on Volume that takes two height-fields: a low-limit one that specifies how far down we let the rays go, and a high-limit one that specifies the height we shoot the rays from. It outputs another height-field identifying, for each cell, the height of the floor we want to render, along with a Boolean mask array that distinguishes where we found floors from where we found none. Note that the input height-fields can be simple scalars - if so, they're expanded out to make height-fields.

We use numpy.atleast_3d to extend the height-field (or height scalar) into a 3D array. This adds extra dimensions of size 1, and allows numpy to "broadcast" when we use it in an operation with another 3D array with large size.
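A tiny illustration of what atleast_3d plus broadcasting buys us (toy shapes, not the real 512×512×128 volume):

```python
import numpy

heightmap = numpy.full((4, 4), 70)   # a uniform 2D height-field
h3d = numpy.atleast_3d(heightmap)    # a trailing axis of size 1 is added
assert h3d.shape == (4, 4, 1)

# The (4, 4, 1) array now broadcasts against anything shaped (4, 4, 128):
cell_depth = numpy.arange(128)       # a depth value for each layer
in_range = cell_depth < h3d          # result broadcasts to (4, 4, 128)
assert in_range.shape == (4, 4, 128)
assert in_range[0, 0, 69] and not in_range[0, 0, 70]
```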

This creates cell_depth as a 3D array of the same size as the volume except one cell shallower, and with each cell having a value equal to its depth. (Actually, "depth" is a bit misleading. It increases in the upward direction. "Altitude" may have been more appropriate.)

This takes all the blocks in the volume, minus the very top layer. We're going to use it in a Boolean expression shortly, so all that matters is whether it's 0 or not. This method is used twice, once to search for non-transparent blocks, and again to search for partially transparent blocks. So when "include_transparent" is true, we just use the values of the blocks themselves, which will be non-zero for all cells apart from air. When "include_transparent" is false, we use a look-up table to decide whether each block is of a transparent or non-transparent type.

This creates "potential_floors" as another 3D Boolean array, where true cells indicate that there's a block we might want to render that's within the range of the ray. But this will still select multiple cells in any given vertical column...

Now we construct "good_floors" out of "potential_floors" and "potential_footspace", selecting all those cells in range where a solid cell has a transparent cell immediately above it. If we think back to the rays being cast, this corresponds to every point where the ray passes from air into a floor. We may still be selecting multiple cells in any given vertical column.

Here numpy.argmax does the magic. We're applying it to a Boolean array. In the case of ties, it always selects the cell with the lowest index. So if there are a mix of true and false values, it will pick the lowest index cell containing true, otherwise it will pick index 0. Since we actually want the highest index cell, we use a negative stride slice ("::-1") to vertically flip the array before the calculation.
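At toy scale, the flip-then-argmax trick looks like this:

```python
import numpy

# A toy column of floor candidates, index 0 at the bottom of the world.
column = numpy.array([False, True, True, False])

# argmax picks the *lowest* index of the maximum value, so flip the
# column first and then convert the flipped index back to an unflipped one.
flipped_index = numpy.argmax(column[::-1])
highest_true = len(column) - 1 - flipped_index
assert highest_true == 2  # the highest True sits at index 2
```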

Finally we calculate the mask by checking whether the cells we finally selected were previously identified as suitable floors, and we clamp our floor heights to fall within our specified range. I've forgotten now why we need to clamp the floor heights. Perhaps that's now redundant.

I'm sorry this is a bit brief. I think I'm going to continue this once a week instead of twice, with posts on Tuesdays, because I don't think I can quite keep up the same pace, and I'm spending a lot of the time I would have used to experiment with new code just putting together the posts and their diagrams.

Please feel free to ask questions if there's anything you're curious about, or make suggestions about better ways to do things. I think there are quite a lot of bits that could be a lot more elegant, especially where the code has grown very organically and still has the odd vestigial appendix that has become quite redundant.

2011-06-17

Last time we loaded a Minecraft file and rendered a big grid of numbers indicating the block ID of cells in the map. Those block IDs don't directly tell us which item in the texture atlas to use. Instead, we create a great big mapping table that says for each block ID which texture from the texture atlas to use. That's still not going to be perfect - for wool blocks we also need to consider the data value to pick the colour. For minecart tracks the data value will tell us the orientation of the tracks, but then we only have two textures - one straight and one corner! In the future we'll see how to write a pixel shader that flips and rotates them appropriately. Other blocks like signs and torches and doors don't have any appropriate texture for rendering a top-down view, so we'll need to create them. Most evil of all is redstone wire - for one thing, the orientation of the wire isn't encoded in the saved game at all – we need to calculate it by looking at surrounding tiles. In addition, Minecraft renders redstone wire using a small number of textures and some rather complicated operations to clip and colour them. I haven't yet tackled that problem, and I might just avoid it.

Today all we do is use the mapping table to look up the position in the texture atlas (which I'm calling the "texture code") for blocks from the same slice of the map as last time. I'm not going to discuss that in detail because it's not terribly exciting. Pull down the code from github, and copy your terrain.png file into the same directory before you run texture_demo.py. (I don't think I can legally include terrain.png in the repo unless I draw it all myself.) You'll see something like this:

Here you can see pink squares in areas of empty space, tree trunks, leaves in dark grey, grass in light grey, dirt in brown and stone in medium grey. The reason the leaves and grass are grey is that the texture in terrain.png is greyscale; Minecraft colours it at run-time based on the biome. We might do that eventually, but it's probably easier just to use a custom terrain.png with green leaves and grass.

One thing that I would like to discuss is some of the stuff we're doing in numpy that's not really important just now, but will be much more important when we render more than just a flat horizontal slice of the world.

def get_cells_using_heightmap(source, heightmap):
    ''' Given a 3D array, and a 2D heightmap, select cells from the
        3D array using indices from the 2D heightmap. '''
    idx = [numpy.arange(dimension) for dimension in source.shape]
    idx = list(numpy.ix_(*idx))
    idx[2] = numpy.expand_dims(heightmap, 2)
    return numpy.squeeze(source[idx])

We use this function to flatten the 3D array of blocks into a 2D array. Note that I will be referring to the dimensions of the 3D array as X, Z and Y, in that order, to match how Minecraft lays them out. The function is pretty hideous, but I couldn't find anywhere describing an easier way to do this in numpy. I think I found it here. Looking at it again I really don't like that it reuses the name idx to mean rather different things. Basically, it works like this:

First we create a numpy.arange for each of the source array's three dimensions. We then feed them into numpy.ix_, which converts them into 3D arrays. The first has dimensions 512×1×1, the second 1×512×1 and the last 1×1×128, and each has the sequence of values that we previously fed in.

We throw away the third one and replace it with our height-field. If our height-field has less than two dimensions, we bump it up to two dimensions. (This lets us pass in a single value if we want a uniform height-field.)

The horrifying magic occurs when we use this bizarre collection of three arrays as the indices into source. First of all, numpy applies its broadcasting rules to expand all of the index arrays to matching dimensions. The result of this is that all three of our indexing arrays end up with dimensions 512×512×1.

The dimensions of the indexing arrays determine the dimensions of the result of the indexing operation. Each element of the output array is determined by looking up the source array with the coordinates from the indexing arrays at the corresponding position. The X and Z indexing arrays basically just pass through the X and Z coordinates unmodified. The Y indexing array is our height field.

Finally, we use numpy.squeeze to discard the last dimension.
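The whole trick at toy scale (one note: recent numpy versions want the index collection to be a tuple rather than a list when indexing):

```python
import numpy

source = numpy.arange(4 * 3 * 5).reshape(4, 3, 5)  # a tiny stand-in volume
heightmap = numpy.array([[0, 1, 2],
                         [3, 4, 0],
                         [1, 2, 3],
                         [4, 0, 1]])

idx = [numpy.arange(dimension) for dimension in source.shape]
idx = list(numpy.ix_(*idx))               # shapes (4,1,1), (1,3,1), (1,1,5)
idx[2] = numpy.expand_dims(heightmap, 2)  # replace the Y indices: shape (4,3,1)
flat = numpy.squeeze(source[tuple(idx)])  # broadcast to (4,3,1), then squeeze

assert flat.shape == (4, 3)
assert flat[1, 2] == source[1, 2, heightmap[1, 2]]
```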

Now that I've spent the time to think about it and explain it, I think it could be made clearer. I think the use of ix_ followed by broadcasting is quite hard to follow. I think we could avoid the extra dimension and the squeeze too. I don't think there's any getting around the fancy indexing.

That's quite enough for today. I'm now undecided what I'll discuss next week. I'm really not sure who's reading and what sort of interests they have. Are you beginners or experts? Do you want tutorials and stuff that you can easily try at home, or would you prefer a faster paced tour of the fancy bits? Is it important to build things up a step at a time or would you prefer I skip over things to get to the good stuff? How painful is this numpy stuff? I could spend some time drawing up diagrams to better explain it, but that would probably mean it will be quite a while before we get to the interesting shader stuff. Let me know what you think in the comments.

2011-06-14

For storage, the current version of Minecraft breaks the world up into 512×512 cell regions and breaks regions up into 16×16 cell chunks. (These are the X×Z dimensions, regions and chunks are both a full 128 cells tall in the Y dimension.) Previous versions stored each chunk as a separate NBT file, but recent versions bundle up a whole region of chunks into a single MCR file. We can use this NBT parser for Python, but (as far as I can tell) it doesn't know anything about MCR files.

MCR files are really just containers for compressed NBT files. The format is described here: Beta Level Format. We can use that information to seek to the compressed chunk data, decompress it, then feed the decompressed data to the NBT parser.
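That seek-decompress-parse dance can be sketched as follows. This follows my reading of the Beta Level Format description (the first 4 KiB holds 1024 location entries of a 3-byte sector offset plus a 1-byte sector count; each chunk starts with a big-endian length and a compression-type byte), so treat the details as assumptions:

```python
import io
import struct
import zlib

def read_chunk_data(region_file, chunk_x, chunk_z):
    """Extract one chunk's decompressed NBT payload from an MCR region file."""
    region_file.seek(4 * ((chunk_x % 32) + (chunk_z % 32) * 32))
    entry = struct.unpack('>I', region_file.read(4))[0]
    sector_offset = entry >> 8       # offset in 4 KiB sectors
    if sector_offset == 0:
        return None                  # this chunk was never generated
    region_file.seek(sector_offset * 4096)
    length = struct.unpack('>I', region_file.read(4))[0]
    compression_type = region_file.read(1)[0]
    assert compression_type == 2     # 2 means zlib
    return zlib.decompress(region_file.read(length - 1))

# Exercise it against a fabricated region containing one chunk at sector 2.
payload = zlib.compress(b'fake nbt data')
header = bytearray(8192)             # location table, then timestamps
header[0:4] = struct.pack('>I', (2 << 8) | 1)  # chunk (0, 0): sector 2, count 1
region = bytes(header) + struct.pack('>I', len(payload) + 1) + b'\x02' + payload
assert read_chunk_data(io.BytesIO(region), 0, 0) == b'fake nbt data'
```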

The parsed NBT file gives us access to various pieces of data. This is described here: Chunk file format. We're most interested in "Blocks", the array of block IDs for the blocks in the chunk. Later on we'll also need "Data", because this distinguishes subtle details of certain blocks, such as the colour of wool blocks and the orientation of minecart track blocks. "SkyLight" and "BlockLight" will let us apply lighting to blocks based on illumination from the sky (skylight) and from torches, lava, glowstone and so on (blocklight).

To better manipulate this data, we're going to put it in a numpy array. We use a 3D "structured array" with fields for each of the four pieces of data we've identified. The class VolumeFactory provides various methods that create volumes:

Apart from the handling for non-existent chunks, load_chunk takes an already-loaded NBT file, and uses arrange_4bit and arrange_8bit to assemble that data into the numpy arrays. (Numpy doesn't allow elements crammed together on boundaries smaller than 1 byte, so we pad the 4-bit values out to 8 bits.)
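The 4-bit unpacking might look something like this (a sketch; the nibble order is my reading of the chunk format, so treat it as an assumption):

```python
import numpy

def arrange_4bit(packed):
    """Expand bytes holding two 4-bit values each into one byte per value.
    The even-indexed cell is assumed to live in the low nibble."""
    packed = numpy.asarray(packed, dtype=numpy.uint8)
    out = numpy.empty(packed.size * 2, dtype=numpy.uint8)
    out[0::2] = packed & 0x0F   # low nibble first
    out[1::2] = packed >> 4     # then the high nibble
    return out

assert list(arrange_4bit([0xAB, 0x01])) == [0x0B, 0x0A, 0x01, 0x00]
```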

Finally, load_region calls load_chunk repeatedly for the 32×32 chunks in a region. It uses numpy's array slicing to paste them together into one big array. Many regions won't be fully-explored, and so many of the chunks will be missing. Those are just filled in with zeroes.

The main program is largely unchanged from last week's zooming and panning one. All that is changed is how we fill the "minecraft_map" texture:

Here I load a Minecraft region from a region file I have on my computer called "world/region/r.0.0.mcr". You can find your own region files in your Minecraft saves folder. (On Linux this is "~/.minecraft/saves"; I'm not sure where it ends up on other systems. Perhaps under %APPDATA% on Windows.) For now, since we're not yet doing anything smart to flatten the Minecraft map, we just take a slice at y=70. This is just a little bit above sea-level, so it contains a mix of air and land. You can see the results in the annotated screenshot below:

As before, the full code is up on github. The new files are "minecraft_mapping.py" and "nbt_demo.py", the latter of which can be run. You'll need to edit it though, because the path to the Minecraft region file is hard-coded and you'll want to point it towards one of your own map files.

On Friday, we'll look at applying Minecraft's own textures to the map, and next week we'll get on to flattening the 3D map data into something (moderately) useful for a map.

2011-06-10

Last time we did some tiled rendering. However, it was pretty awkward to look at the results because we needed to edit the code and re-run it if we wanted to move around. Today we add zooming and panning controls.

Before we start looking at the details, I should point out that all the code is now available on github. That includes the previous two examples "opengl1.py" and "opengl2.py", as well as today's, "zoompan.py". If you look in there you can see everything I've written for this project, but be aware that most of it is very haphazard. I'm brushing off and tidying up the corresponding parts as I write each article. Feel free to ask questions about any of the code in there, but please don't beat me up too much for my coding style!

As you'll probably have noticed, all the code for rendering from a different position and zoom factor was already there. All we need to do is add handling for the mouse events that appropriately updates the camera position and zoom factor. Mouse buttons 4 and 5 are triggered by the scroll wheel: when it rolls up we double the zoom factor, and when it rolls down we halve it.
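The zoom rule is simple enough to express as a pure function (button numbers follow the pygame convention mentioned above; the function name is mine, not from zoompan.py):

```python
def update_zoom(zoom, button):
    """Double or halve the zoom factor on scroll-wheel events."""
    if button == 4:
        return zoom * 2.0   # wheel up: zoom in
    if button == 5:
        return zoom / 2.0   # wheel down: zoom out
    return zoom             # any other button: unchanged
```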

Mouse button 1 is the left (or primary) button. When it is pressed down we enter dragging mode and when it is released we leave dragging mode. Whenever the mouse moves in dragging mode, we track how far it moved and update the camera position, taking into account the zoom factor. Note that for mouse events the y coordinate increases down the screen, while our camera y increases up the screen.
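The panning arithmetic might look like this (names and sign convention are my assumptions about the real code):

```python
def drag_camera(cam_x, cam_y, dx, dy, zoom):
    """Move the camera by a mouse delta (dx, dy) in screen pixels.

    The delta is scaled by the zoom factor so a one-pixel drag moves
    the view by one screen pixel, and the y axis is flipped because
    mouse y grows down the screen while camera y grows up it.
    """
    return (cam_x - dx / zoom,
            cam_y + dy / zoom)
```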

One last thing to note is that when zooming out, we apply rounding to the camera position. This is so that every pixel matches up perfectly with a texel. When I didn't have this, sometimes if you zoomed in, panned around and then zoomed out you'd see odd pixels at the boundaries between tiles.
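The idea behind the rounding: when zoomed out, each screen pixel covers 1/zoom texels, so the camera position is snapped to a multiple of that step. A sketch of the idea (not the exact code from zoompan.py):

```python
def snap_camera(cam, zoom):
    """Snap one camera coordinate so pixels line up with texels."""
    if zoom >= 1.0:
        return cam          # zoomed in: every texel covers >= 1 pixel
    step = 1.0 / zoom       # texels per screen pixel when zoomed out
    return round(cam / step) * step
```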

The other minor change this time is to use mipmapping. This involves some changes to how we set up our texture atlas:

As I understand it, GL_GENERATE_MIPMAP is deprecated, but the alternative – the glGenerateMipmap function – isn't in PyOpenGL's OpenGL.GL module. I didn't really look into it much. I assume that it's either from a newer version of OpenGL than PyOpenGL supports, or it's in some obscure submodule.

I'm using GL_NEAREST_MIPMAP_NEAREST as the minification function. This means that we don't do any interpolation - we just pick the closest mipmap level and in that mipmap level pick the closest texel. This is fine because our zoom levels are powers of 2 and we keep the texels and pixels lined up. It also avoids some ways for things to go wrong.

Instead of using the texture function, we use textureGrad. This lets us give the GPU a hand in choosing a mipmap level. Without it, mipmap levels are chosen by looking at how much the texture coordinates are changing between one screen pixel and the next. This works fine in the middle of our tiles, but fails at the edges, because our texture coordinates jump suddenly between one tile and another. In the image on the right you can see inappropriate mipmap levels at the borders of tiles.

The extra parameters for textureGrad allow us to specify the rate of change of the texture coordinates per pixel both horizontally and vertically. We can calculate them based on the zoom level.
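In the fragment shader that might look like the following sketch. One atlas tile is 1/16 of the atlas in each direction and spans (tile pixels × zoom) screen pixels, so the rate of change is constant across the whole quad; the names `atlas`, `atlas_coord`, `tile_pixels` and `zoom` are my assumptions:

```glsl
// Per-pixel rate of change of the atlas coordinate, horizontally
// and vertically; constant because pixels and texels stay aligned.
vec2 ddx = vec2(1.0 / (16.0 * tile_pixels * zoom), 0.0);
vec2 ddy = vec2(0.0, 1.0 / (16.0 * tile_pixels * zoom));

// Sample with explicit gradients instead of letting the GPU guess.
vec4 colour = textureGrad(atlas, atlas_coord, ddx, ddy);
```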

You can get the current version of the code from github.
Next week, we'll start looking at reading Minecraft maps. We'll be using an NBT parser for Python with some extra work to read the region files used by recent versions of Minecraft.

The fragment shader needs to calculate what colour each pixel should be. As input, it takes the texture coordinates that were output by the vertex shader. We have a single great big square with texture coordinate (0,0) in the bottom left and texture coordinate (1,1) in the top right. We want to texture this with a 512×512 grid of square textures drawn from our 16×16 grid of textures in our texture atlas.

The fragment shader needs to do two things with the texture coordinate. One is to look up our 512×512 texture that is our Minecraft map region - it can do that directly with no extra calculation. The other is to take that texture coordinate, consider it as a point in a grid of 512×512 cells, and figure out where it falls within its cell. We call this position theta. A theta of (0,0) corresponds to the bottom left of the cell, and (1,1) to the top right of the cell.

Given the sample taken from our map texture, we extract the red value to get a code between 0 and 255 that indicates a cell in our texture atlas. We call the position of that cell phi. The bottom-left cell in the texture atlas has phi=(0,0), the top-right cell has phi=(15, 15).

Finally, combining phi and theta identifies the position to sample from the texture atlas for the fragment we're rendering. Here's the new shader:
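(What follows is my sketch of the shader described above; the uniform and variable names are guesses, not necessarily those in the real opengl2.py.)

```glsl
#version 130
uniform sampler2D map;    // 512x512 region map, block IDs in red
uniform sampler2D atlas;  // 16x16 grid of block textures
in vec2 texcoord;
out vec4 fragcolour;

void main() {
    // Where are we within our cell of the 512x512 grid?
    vec2 theta = fract(texcoord * 512.0);

    // The red channel holds a code 0..255; phi is the (column, row)
    // of the corresponding cell in the 16x16 atlas.
    float code = floor(texture(map, texcoord).r * 255.0 + 0.5);
    vec2 phi = vec2(mod(code, 16.0), floor(code / 16.0));

    // Combine phi and theta into an atlas coordinate and sample it.
    fragcolour = texture(atlas, (phi + theta) / 16.0);
}
```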

The complete source file is here: opengl2.py. There are a few other changes. I've also tried enabling mip-mapping on the texture atlas, while limiting it to four levels, because beyond that the mip-maps would start blending together unrelated tiles. Unfortunately, I think I've still not quite got it working right, but that won't be immediately obvious. I might talk about it more in a later article, but it's going to need some work.

Here's the end result. (Click it to view it full size.) As you'll note, for now we've used the diagnostic texture for both the map texture and the texture atlas. On Friday we'll add some controls to zoom and pan so that we can actually see the whole thing and next week we'll start reading from Minecraft files.

2011-06-03

Last time we made ourselves a simple texture using Pygame. Today we're going to render it as-is on the screen using PyOpenGL. Simple, right?

This will probably be a little steep if you don't know any OpenGL. However, fear not. Read An intro to modern OpenGL by Joe Groff. It's great, and it's how I learned to write GLSL shaders. Well, I say learned to write them – I've pretty much only written the one that you're going to see in these articles. The tutorial's in C, but I've translated the early parts into Python: Python OpenGL tutorial. Lastly, don't worry if the 3D maths is scary – we won't be needing it since we're sticking solidly to 2D. (That said, I am quite scared of the maths in the ambient occlusion bit. But we'll worry about that when we come to it.)

In addition to Python and Pygame, you're going to need PyOpenGL and NumPy. I installed them on Ubuntu using:

sudo apt-get install python-opengl
sudo apt-get install python-numpy

(This should all work fine on Windows and Mac, but you'll need to download and install all these things separately.)

Our square sits between (-1,-1,0) and (1,1,0) in world space. Screen space has (-1,-1) as the bottom-left of the screen and (1,1) as the top-right. If the texture is texture_dimension pixels across, and cam_position is a set of pixel coordinates in the texture, this will position the square so that the given pixel is in the centre of the screen. We also use the original world space coordinates as the texture coordinates, but we multiply by 0.5 and add 0.5 because we want the texture coordinates to range from 0 to 1. So the bottom-left of our texture will have texture coordinates (0,0) and the top-right will have (1,1).
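As a vertex shader, that description might come out something like this sketch (the uniform names `cam_position`, `zoom`, `texture_dimension` and `screen_dimensions` are my assumptions about the real code):

```glsl
#version 130
uniform vec2 cam_position;        // pixel coordinates within the texture
uniform float zoom;
uniform float texture_dimension;  // texture size in pixels
uniform vec2 screen_dimensions;   // window size in pixels
in vec2 position;                 // corners of the square, -1..1
out vec2 texcoord;

void main() {
    // Map world-space -1..1 onto texture coordinates 0..1.
    texcoord = position * 0.5 + 0.5;

    // Offset from the camera in texture pixels, scaled by the zoom,
    // then converted to screen space so cam_position lands at centre.
    vec2 pixel = (texcoord * texture_dimension - cam_position) * zoom;
    gl_Position = vec4(2.0 * pixel / screen_dimensions, 0.0, 1.0);
}
```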

Given the simplicity of the shaders, rendering is a somewhat gruesome affair. We need to provide a value for each of the uniforms that we defined in the shaders, bind textures to slots, connect up and enable a vertex array on each attribute and finally draw our triangle strip.

And this is the output. Not thrilling, I'll admit, but I wanted to get as much of the tedious stuff out of the way to avoid confusing things too much when we do more with the shaders. You can download the two files from github.

About Me

I live in Glasgow, and when I'm not at work programming I'm often at home programming in Python for fun. I enjoy puzzles, algorithms, graphics and games. I've previously worked in video games and in a hi-fi company, and now I work for a travel search company.