
Nov 25, 2013

Using a fat buffer (16 bits per channel) is the most intuitive way to store an HDR render target, but it was too slow and used too much memory, so we decided to encode (pack) it into an 8-bit-per-channel buffer. There are different packing methods, but I believe the most widely used one is RGBM. (That link also has a nice summary of LogLUV, so give it a read if you are not familiar with this topic at all.)

Packing.. Yum.... It's nice: it saves bandwidth and memory...

RGBM Limitations

But... YUCK we couldn't do one thing that artists absolutely love to see.... ALPHA BLENDING..

We simply can't do alpha blending on an RGBM buffer. In other words, you shouldn't have semi-transparent objects. Why? With RGBM, you need this formula to blend pixels properly:

FinalColour = SrcColor * SrcMultiplier * SrcAlpha + DestColour * DestMultiplier * InvSrcAlpha

DestMultiplier: stored in DestAlpha channel, which is easily available to our blend unit

SrcMultiplier: alpha channel output from shader

SrcAlpha: transparency value. But where do we store this? Usually in the alpha channel, but with RGBM the alpha channel is used for the multiplier... eh... eh... oh yeah! There is something called "premultiplied alpha." By multiplying the alpha value into SrcColor in the shader, you can still get this working. Yay?

InvSrcAlpha: but this is where we fail miserably. We don't have the transparency value anymore, and there's no easy-n-efficient way to pre-multiply InvSrcAlpha into DestColour... yes... we are doomed.
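To make the failure concrete, here is a minimal sketch in plain Python (hypothetical helper names, and the multiplier range is simplified) of what goes wrong when the blend unit naively blends RGBM-encoded channels:

```python
# Why naive alpha blending breaks on an RGBM buffer: the blend unit mixes
# the *encoded* channels (including the multiplier in alpha) without
# knowing what they mean. Simplified RGBM: M = max(RGB), alpha = M / 6.

def rgbm_encode(rgb):
    m = max(max(rgb), 1e-6)
    return tuple(c / m for c in rgb), m / 6.0  # encoded RGB, alpha channel

def rgbm_decode(rgb, a):
    return tuple(c * a * 6.0 for c in rgb)

def blend(src, dst, src_alpha):  # the standard lerp-style alpha blend
    return tuple(s * src_alpha + d * (1.0 - src_alpha) for s, d in zip(src, dst))

src_hdr, dst_hdr, alpha = (4.0, 1.0, 0.0), (0.2, 0.2, 0.2), 0.5

# What we want: blend in decoded (linear HDR) space.
want = blend(src_hdr, dst_hdr, alpha)

# What the hardware would actually do: blend the encoded values
# channel by channel, multiplier included.
(se, sa), (de, da) = rgbm_encode(src_hdr), rgbm_encode(dst_hdr)
got = rgbm_decode(blend(se, de, alpha), sa * alpha + da * (1.0 - alpha))

print(want)  # the correct result
print(got)   # not the same -> blending encoded values is wrong
```

The two printed colours disagree because the encoded channels of source and destination live in different scales, so lerping them has no physical meaning.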

Existing Solution

So what did we do to solve this problem? Eh... nothing... We simply bypassed it by introducing a separate off-screen buffer. We drew all semi-transparent objects onto this buffer with no HDR encoding, and merged the result onto the scene buffer later, after tone mapping was done on the HDR (opaque) buffer.

So basically you get HDR only on opaque objects, and all semi-transparent objects will be in LDR. But merging, or running an extra full-screen pass, was also slow, so we experimented with half-res and quarter-res approaches here to counterbalance the extra cost of keeping another render target.

Well, it still uses more memory than simply using only an RGBM buffer.

Blendable RGBM

So here I present my hack. Blendable RGBM. Here is a short summary.

it blends transparent objects in a mathematically correct way

it doesn't use any extra memory

transparent pixels reuse the multipliers from the opaque pixels, which are already in the render target

but it often suffers from lower precision, so if there is a huge difference in range between the opaque and alpha pixels,

you might see banding (when alpha objects are darkening), and

the alpha objects might not be able to brighten pixels beyond what the opaque multiplier allows

I said we were failing miserably because of InvSrcAlpha, and that was because we were filling the shader's alpha output with SrcMultiplier. So what are we going to do about it? Just throw away SrcMultiplier... LOL... Instead, we will simply match SrcMultiplier to DestMultiplier. After this hackery hack, the blend states become this:

SrcBlend: DestAlpha

DestBlend: InvSrcAlpha

After this change, the blending becomes very similar to the above formula, with one exception: the multiplication by SrcAlpha is missing. Since we are already setting DestAlpha as SrcBlend, we can't feed SrcAlpha to the blending unit. The solution? Premultiplied alpha again. :-) Premultiply SrcAlpha into SrcColor in the shader.

So are we good? no.. not really. With our current blending states setup, we get this:

DestColour * InvSrcAlpha + SrcColor * SrcAlpha * DestMultiplier

Doh! We are multiplying by DestMultiplier!! But we have to divide by it :( To fix this, we will change the RGBM encoding and decoding a bit.

Changing RGBM Encoding/Decoding

This is essentially how "original" RGBM encoding works:

M: max(RGB)

encoding: RGB / M

decoding: RGB * M

alpha encoding: M / 6

To make it work with our new way, we simply invert M, so that we multiply while encoding and divide while decoding. It will look something like this:

M: 1 / clamp( length(RGB), 1, 6 )

encoding: RGB * M

decoding: RGB / M

alpha encoding: M

The reason why I used length(RGB) instead of max(RGB) was to give it some extra headroom which alpha objects can later use to brighten the pixels. How I get M is very hacky. Maybe I could replace the two 1s with a much lower value like 0.125 to allow better precision in LDR, but keeping the minimum of the range at 1 guarantees that the full LDR range is available for the alpha pass. Still, I'm pretty sure someone can come up with a better way of calculating M here. :)
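As a sanity check, here is the reversed encoding sketched in plain Python (hypothetical function names; real code would do this in the shader):

```python
import math

def encode(rgb):
    # M = 1 / clamp(length(RGB), 1, 6); M is stored directly in the alpha channel
    m = 1.0 / min(max(math.sqrt(sum(c * c for c in rgb)), 1.0), 6.0)
    return tuple(c * m for c in rgb), m

def decode(rgb, m):
    return tuple(c / m for c in rgb)

hdr = (3.0, 1.0, 0.5)
encoded, m = encode(hdr)

# Every encoded channel of this example fits in an 8-bit [0, 1] range...
assert all(0.0 <= c <= 1.0 for c in encoded)
# ...and decoding divides the multiplier back out.
print(decode(encoded, m))  # round-trips to (3.0, 1.0, 0.5), up to float error
```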

Finally New Blending Formula

With all the changes up to here, our new blending formula looks like this:

FinalColour = DestColour * InvSrcAlpha + SrcColor * SrcAlpha * DestMultiplier

Oh, hello there! This is exactly what our blending render states setup gave us! Yay! it works!
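Putting it all together, here is an end-to-end sketch in plain Python (hypothetical names) of the whole scheme: reversed encoding in the opaque pass, premultiplied alpha with SrcBlend = DestAlpha / DestBlend = InvSrcAlpha in the alpha pass, and alpha writes masked out:

```python
import math

def encode(rgb):  # M = 1 / clamp(length(RGB), 1, 6)
    m = 1.0 / min(max(math.sqrt(sum(c * c for c in rgb)), 1.0), 6.0)
    return tuple(c * m for c in rgb), m

def decode(rgb, m):
    return tuple(c / m for c in rgb)

# Opaque pass: HDR colour encoded into the render target.
dest_hdr = (2.0, 1.0, 0.5)
dest_rgb, dest_m = encode(dest_hdr)          # dest_m lives in DestAlpha

# Alpha pass: an LDR transparent object; the shader outputs premultiplied colour.
src, src_alpha = (0.8, 0.2, 0.1), 0.25
premult = tuple(c * src_alpha for c in src)

# Blend unit: FinalColour = SrcColor * DestAlpha + DestColour * InvSrcAlpha.
blended = tuple(p * dest_m + d * (1.0 - src_alpha)
                for p, d in zip(premult, dest_rgb))
# Alpha write is masked, so the multiplier in DestAlpha stays dest_m.

result = decode(blended, dest_m)
expected = tuple(s * src_alpha + d * (1.0 - src_alpha)
                 for s, d in zip(src, dest_hdr))
print(result)  # matches the mathematically correct blend, up to float error
```

The trick is visible in the algebra: multiplying the premultiplied source by dest_m moves it into the destination's encoded space, so dividing by dest_m at decode time recovers exactly SrcColor * SrcAlpha + DestColour * InvSrcAlpha.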

Disabling Alpha Write

But wait... if we output our transparency value in the alpha channel, the multiplier in the HDR buffer will be overwritten, right? We don't want that. We need the alpha value only for blending, so we have to make sure it is not written to the render target.

We can do it easily by masking alpha out of the colour write. Another render state change will do the magic (e.g. set ColorWriteEnable to Red | Green | Blue only).

Additive Blending

My way works with additive blending too. Set these blending states:

SrcBlend: DestAlpha

DestBlend: One
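Tracing the same sketch through these states (again plain Python, hypothetical names) shows why additive blending works: the DestAlpha multiplier converts the source into the destination's encoded space, and the sum decodes back to SrcColor + DestColour:

```python
import math

def encode(rgb):  # M = 1 / clamp(length(RGB), 1, 6)
    m = 1.0 / min(max(math.sqrt(sum(c * c for c in rgb)), 1.0), 6.0)
    return tuple(c * m for c in rgb), m

def decode(rgb, m):
    return tuple(c / m for c in rgb)

dest_hdr = (2.0, 1.0, 0.5)
dest_rgb, dest_m = encode(dest_hdr)

src = (0.2, 0.1, 0.05)  # additive LDR contribution

# Blend unit: FinalColour = SrcColor * DestAlpha + DestColour * One.
blended = tuple(s * dest_m + d for s, d in zip(src, dest_rgb))
# Note: a real 8-bit target clamps each channel at 1.0, which is exactly
# where the "can't brighten beyond the opaque multiplier" limitation comes from.

result = decode(blended, dest_m)
print(result)  # ~ (2.2, 1.1, 0.55), i.e. src + dest_hdr
```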

Caution

As I mentioned earlier, this approach suffers from two side effects.

When the transparent objects darken pixels too much, you will see banding due to the lack of precision with this approach. This happens only when the dest pixel was in the HDR range.

Also, because the blend reuses the opaque pixel's multiplier, alpha objects might not be able to brighten pixels beyond what that multiplier allows.

Using blendable RGBM makes sense only under certain conditions, and I have seen those conditions personally. So use it only when you can't afford a separate offscreen buffer for memory or performance reasons, and when you are willing to sacrifice some visual quality in return. After all, it's a trade-off between visual quality and memory use.

p.s.

If you are working on newer consoles, HDR packing might not be needed anymore. But I'm pretty sure there are still a lot of developers who are stuck with older machines, so hopefully they will find this post helpful. :)