Wednesday, March 21, 2012

Cycles internally has an implementation of the Ward anisotropic BRDF, but it has remained disabled. There weren't consistent tangent directions across polygon face borders, and so the shading would result in a very faceted appearance.

I've proposed a patch that re-enables the anisotropic closure node and adds the computation of consistent tangents across meshes. It does require the artist to lay out UVs on the mesh, however; without them the object renders completely black.

The tangent machinery calculates tangents based on the change in position along the U direction of the UV layout, so if the artist deliberately creates a UV seam or break, that will also be reflected in the shading. Buyer beware: the UVs can allow you to carefully control the 'brushed' direction of the surface, but you'll have to strategically hide the seams if they are visually distracting.
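To make "change in position along the U direction" a little more concrete, here is a minimal Python sketch for a single triangle. It is not the Cycles code: the function name `triangle_tangent` is made up for illustration, and it only handles one face, whereas the patch also has to make tangents agree across neighboring faces.

```python
import math

def triangle_tangent(p0, p1, p2, uv0, uv1, uv2):
    """Unit vector pointing along increasing U for one triangle.

    p0..p2 are 3D positions, uv0..uv2 are (u, v) pairs.
    Returns None when the UVs are degenerate (e.g. no UV layout at all).
    """
    # Edge vectors in 3D space.
    e1 = [b - a for a, b in zip(p0, p1)]
    e2 = [b - a for a, b in zip(p0, p2)]
    # The same edges measured in UV space.
    du1, dv1 = uv1[0] - uv0[0], uv1[1] - uv0[1]
    du2, dv2 = uv2[0] - uv0[0], uv2[1] - uv0[1]
    # Solve e1 = du1*T + dv1*B and e2 = du2*T + dv2*B for T.
    det = du1 * dv2 - du2 * dv1
    if abs(det) < 1e-12:
        return None
    t = [(dv2 * a - dv1 * b) / det for a, b in zip(e1, e2)]
    length = math.sqrt(sum(c * c for c in t))
    if length == 0.0:
        return None
    return [c / length for c in t]

# U runs along +X here, so the tangent is the X axis.
tangent = triangle_tangent((0, 0, 0), (1, 0, 0), (0, 1, 0),
                           (0, 0), (1, 0), (0, 1))
```

Rotating the UV island rotates the tangent with it, which is what lets the UV layout steer the 'brushed' direction. A UV seam means two faces disagree about which way U runs, and that is where the shading break shows up.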

Here is a quick test showing the results. The top-left sphere is just diffuse (for reference), and the top-right greenish sphere is the existing glossy specular, which is not anisotropic. The other sphere, box, torus, and Suzanne are all using the anisotropic closure, with some fairly simple UVs laid out by yours truly.

On my GTX 460 this took about 3:25 to render, with 1000 passes. I used the Uffizi lightprobe for the background, with HDR range and importance sampling (my other patches). The combination makes for some decent metal appearances.

If you look closely, you can spot a seam or two. The UV sphere is one I generated with my own Python script (which outputs an OBJ), so that I could very carefully control the UV layout at the poles of the sphere. Even with that care taken, there is a slight seam apparent at the top.
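The script itself isn't included in this post. As a rough idea of what a generator like that can look like, here is a minimal sketch, not the original script. The function name `uv_sphere_obj` and the pole handling are assumptions for illustration: every pole triangle gets its own copy of the pole vertex, so each one carries its own U value instead of all of them sharing one.

```python
import math

def uv_sphere_obj(segments=16, rings=8, radius=1.0):
    """Return Wavefront OBJ text for a UV sphere with texture coordinates."""
    if segments < 3 or rings < 2:
        raise ValueError("need at least 3 segments and 2 rings")
    lines = []
    # (rings + 1) rows by (segments + 1) columns. The extra column repeats
    # the seam so U runs from 0 to 1 without wrapping, and each pole row
    # holds one pole vertex per column.
    for r in range(rings + 1):
        theta = math.pi * r / rings          # 0 at the north pole
        for s in range(segments + 1):
            phi = 2.0 * math.pi * s / segments
            x = radius * math.sin(theta) * math.cos(phi)
            y = radius * math.cos(theta)
            z = radius * math.sin(theta) * math.sin(phi)
            lines.append(f"v {x:.6f} {y:.6f} {z:.6f}")
    for r in range(rings + 1):
        for s in range(segments + 1):
            lines.append(f"vt {s / segments:.6f} {1.0 - r / rings:.6f}")

    def idx(r, s):
        # OBJ indices are 1-based; v and vt share the same ordering.
        return r * (segments + 1) + s + 1

    for r in range(rings):
        for s in range(segments):
            a, b = idx(r, s), idx(r, s + 1)
            c, d = idx(r + 1, s + 1), idx(r + 1, s)
            if r == 0:
                face = (a, c, d)         # a and b coincide at the north pole
            elif r == rings - 1:
                face = (a, b, d)         # c and d coincide at the south pole
            else:
                face = (a, b, c, d)
            lines.append("f " + " ".join(f"{i}/{i}" for i in face))
    return "\n".join(lines) + "\n"

obj_text = uv_sphere_obj(segments=32, rings=16)
```

Writing `obj_text` to a `.obj` file gives something Blender can import. Because V is constant along each pole row while U still varies, the layout at the poles stays under explicit control, though as noted above that doesn't guarantee a seam-free result.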

Edit: 7 months after I submitted the change, and now that Blender 2.64 is released, my patch has been accepted into trunk. Brecht added a couple of extra goodies to make sure tangents can be gathered in other ways, too. Thanks, Brecht!

Wednesday, March 7, 2012

I had proposed a patch for HDR texture sampling a ways back, and Brecht has committed it (with a few small modifications). I feel like there's a texture slot bug lurking in there that I may need to fix, but I'll have to verify that.

Essentially, the change allocates 5 of the 100 texture slots to be full floating point textures. Generally the data set there will be linear (and Brecht changed the monikers for those to be "color" vs. "non-color" instead of linear vs sRGB). Any source texture that is naturally a float format, such as EXR or 32-bit float TIFF, or HDR, will automatically get one of those slots.

A couple of things to keep in mind:

You've only got 5 full-float texture slots, so be judicious in your use of them. Any textures you don't want using those slots need to be saved on disk in a format that is not floating point, and they will land in the regular texture slots instead.

Anyone who was using the 100 slots before for non-float textures, though, is in for a surprise; they've lost 5 regular (non-float) texture slots. Never fear, all you have to do is change some textures to a float format (such as EXR) and you'll get back to the 100 total slots that way.
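The bookkeeping in the two points above can be sketched in a few lines of Python. This is only an illustration of the 5 + 95 split, not how Cycles actually assigns slots; `assign_slots` and the `is_float` flag are made-up names, and the caller is assumed to already know whether each file is stored as float on disk.

```python
NUM_FLOAT_SLOTS = 5    # full floating point slots
NUM_BYTE_SLOTS = 95    # what is left of the 100 for regular textures

def assign_slots(textures):
    """Map each (name, is_float) pair to a ("float" | "byte", index) slot."""
    assignment = {}
    next_float = 0
    next_byte = 0
    for name, is_float in textures:
        if is_float:
            if next_float >= NUM_FLOAT_SLOTS:
                raise RuntimeError(f"out of float texture slots at {name!r}")
            assignment[name] = ("float", next_float)
            next_float += 1
        else:
            if next_byte >= NUM_BYTE_SLOTS:
                raise RuntimeError(f"out of byte texture slots at {name!r}")
            assignment[name] = ("byte", next_byte)
            next_byte += 1
    return assignment

slots = assign_slots([
    ("uffizi_probe.exr", True),    # float on disk, takes a float slot
    ("wood_albedo.png", False),    # 8-bit on disk, takes a regular slot
    ("scratch_mask.png", False),
])
```

The file names are placeholders. A scene with 100 byte textures fails at the 96th, while 95 byte textures plus 5 float ones fit exactly.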

The management is a little bit manual for now, but I think the long-term strategy for Cycles is to handle texture management differently. Textures will be packed into layered versions, and sampling them will be done manually in the kernel (slower, but higher quality). This way the 100 texture limit will be eliminated, and better sampling such as anisotropic kernels can be used, even on GPU.

Wednesday, January 25, 2012

One of the things I discovered while working on environment importance sampling is that Cycles encodes all of its textures to 32-bit byte-per-component RGBA format before uploading to the compute device. Essentially, this means all textures are low dynamic range.

This is fine for masks and reflectance maps (e.g. most surface textures), but won't cut the mustard for at least a couple of scenarios: HDR backgrounds/environments, and interesting compositing techniques within shader networks. You can imagine bringing in a map whose range is not 0.0 to 1.0, and then using it as a multiplier or other effect on shading for a surface shader. Or, you can imagine using an HDR texture for albedos in volumetric rendering, as those values go from above 0.0 to essentially infinity.

Long story short, Cycles needs the range, at least for some textures. The primary issue is that of memory usage; on GPUs with limited memory, taking a 32-bit RGBA texture to a float RGBA texture quadruples its memory usage. A single 2k x 2k texture currently takes 16 MB of RAM; stored as floats it will take 64 MB. When all you have is, say, 1 GB of RAM, 64 MB for a single texture is quite a big chunk of the budget.
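The arithmetic is easy to check with a few lines of Python. The helper name `texture_bytes` is just for illustration, and "2k" is taken to mean 2048 pixels on a side.

```python
MIB = 1024 * 1024

def texture_bytes(width, height, channels=4, bytes_per_channel=1):
    """Uncompressed size of a texture in bytes."""
    return width * height * channels * bytes_per_channel

byte_rgba = texture_bytes(2048, 2048, bytes_per_channel=1)    # 16 MiB
float_rgba = texture_bytes(2048, 2048, bytes_per_channel=4)   # 64 MiB
half_rgba = texture_bytes(2048, 2048, bytes_per_channel=2)    # 32 MiB

# Share of a 1 GiB card eaten by one float texture: 6.25%.
share_of_1gib = float_rgba / (1024 * MIB)
```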

There is an alternative format, half float, which obviously takes half the space of full float but doesn't have quite as good fidelity. Honestly, it's probably good enough for texture sampling. The drawback is that, while GPUs natively support computations in half format, CPUs do not. It would likely end up being a performance penalty too large to ignore when CPU rendering.
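For a feel of what the lower fidelity means in practice, Python's `struct` module can round-trip a value through IEEE 754 half precision using the `e` format code. This is only a way to poke at the number format, not anything Cycles does; `to_half_and_back` is a made-up helper name.

```python
import struct

def to_half_and_back(x):
    """Round x to IEEE 754 half precision and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

exact_one = to_half_and_back(1.0)        # small powers of two survive exactly
tenth = to_half_and_back(0.1)            # close to 0.1, but not equal to it
big_int = to_half_and_back(2049.0)       # integers above 2048 start to collapse
largest = to_half_and_back(65504.0)      # the largest finite half value

try:
    struct.pack("<e", 1.0e5)             # too large for half precision
    overflowed = False
except OverflowError:
    overflowed = True
```

Half keeps only 10 mantissa bits and tops out at 65504, so a very hot pixel in a light probe can exceed its range even though it fits easily in a full float.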

An obvious answer is to only use float textures where it is needed. This approach is difficult because the texture slots for GPUs must be known beforehand, and on CUDA there is a 128 texture limit. Cycles uses the last 28 for internal purposes, so there are 100 to work with. Siphoning any portion of those off to become float textures impacts that aspect of the budget.

Suffice it to say, this'll be an interesting acrobatic exercise to see how to impact texture budgets the least, while providing the HDR texture benefit. I went ahead and switched all of the texture slots to float textures just to see the benefit on a small scene. Here is a comparison of the Grace Cathedral lightprobe lighting the same scene I used for the environment importance sampling testing, with the LDR (current Cycles) results first, both at 100 paths/pixel:

Low dynamic range (0.0 to 1.0 clamp) -- Grace Cathedral light probe

High dynamic range -- Grace Cathedral light probe

To say the lighting difference is dramatic is an understatement. In particular, there is a very bright light almost directly overhead in the environment texture, and in the LDR range it is almost completely lost relative to the other bright regions of the environment. The HDR version preserves it, and produces much more natural lighting to boot. Note that the area of the map with the bright light is relatively small, and the environment-lit scene almost looks like it's lit with a point light! That is the advantage of HDR environments combined with importance sampling.
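A toy calculation shows why the clamp hurts so much. The numbers below are invented for illustration and have nothing to do with the real Grace Cathedral probe: 99 dim pixels at 0.5 and one hot pixel at 500.

```python
def bright_share(pixels, bright_index, clamp=None):
    """Fraction of the total energy contributed by one pixel."""
    if clamp is not None:
        pixels = [min(p, clamp) for p in pixels]
    return pixels[bright_index] / sum(pixels)

env = [0.5] * 99 + [500.0]                 # made-up environment values

hdr_share = bright_share(env, 99)              # roughly 91% of the energy
ldr_share = bright_share(env, 99, clamp=1.0)   # roughly 2% after clamping
```

With the range intact, the single hot pixel dominates the lighting, which is why the scene reads as if lit by a point light. Clamped to 1.0, it becomes just another slightly brighter texel.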

I have yet to figure out how to not blow texture budgets, but until I do, for anyone who wants a full-float texture environment, let me know and I can give you a patch. Just keep in mind the impact it has on RAM usage; you may have to switch to CPU rendering depending on your scene.

Sunday, January 15, 2012

I submitted a patch for environment importance sampling (and multiple importance sampling too) that should help reduce noise in scenes with any sort of complex environment in them. It treats the environment as a light so that it participates in the direct lighting machinery, and it also adds an importance map that records where the bright spots in the image are, so that sampling can favor those.
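The importance map idea can be sketched in Python as a cumulative distribution over pixel brightness. This is a simplified 1D illustration under my own assumptions, not the patch: a real environment map is 2D and also needs a solid-angle correction, and the names `build_cdf`, `sample_pixel`, and `pixel_pdf` are invented here.

```python
import bisect

def build_cdf(luminances):
    """Normalized running sum of pixel luminances."""
    total = float(sum(luminances))
    if total <= 0.0:
        raise ValueError("environment has no energy to sample")
    cdf = []
    running = 0.0
    for lum in luminances:
        running += lum
        cdf.append(running / total)
    cdf[-1] = 1.0          # guard against rounding drift
    return cdf

def sample_pixel(cdf, u):
    """Pick a pixel index from a uniform random number u in [0, 1)."""
    return min(bisect.bisect_right(cdf, u), len(cdf) - 1)

def pixel_pdf(luminances, index):
    """Probability of picking that pixel; divide the sample by this."""
    return luminances[index] / float(sum(luminances))

luminances = [1.0, 1.0, 98.0]      # one pixel holds 98% of the energy
cdf = build_cdf(luminances)
picked = sample_pixel(cdf, 0.5)    # lands on the bright pixel, index 2
```

Bright pixels own a wide stretch of the CDF, so uniform random numbers land on them far more often. Dividing each sample by its probability keeps the estimate unbiased while the noise drops.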

Here is a scene with the Grace Cathedral light probe as the environment, before and after. Click each image to get a full size, as you will see the noise better. This first set has 1000 paths/pixel, and the difference is noticeable:

Before - no environment light

After - with environment light

Even more dramatic is to see it with fewer paths/pixel. Here is the same, but with 100 paths/pixel:

Before - no environment light (100 paths/pixel)

After - with environment light (100 paths/pixel)

In order to ensure I had this right, I used a debug environment map that looks like this:

I just placed a sphere with smoothed normals in the scene, and ran with 400 paths/pixel. The difference is pretty striking:

Before - no environment light (debug scene, 400 paths/pixel)

After - with environment light (debug scene, 400 paths/pixel)

It took longer than expected because I was learning my way around the codebase, and there were a couple of noise issues that are subtle and hard to catch if you don't know to look for them. The results didn't seem right, so I kept digging, and when I finally found the code that was wrong (one line of it!) everything fell into place and it looked great.