per-object post-processing

Hello OpenGL gurus,

Suppose I need to render the following scene:

Two cubes, one yellow, another red.
The red cube needs to 'glow' with red light, the yellow one does not glow.
The cubes are rotating around the common center of gravity.
The camera is positioned in such a way that when the red, glowing cube is close to the camera, it partially obstructs the yellow cube, and when the yellow cube is close to the camera, it partially obstructs the red, glowing one.

If not for the glow, the scene would be trivial to render. With the glow, I can see at least 2 ways of rendering it:

####### WAY 1 ###########

1. Render the yellow cube to the screen.
2. Compute where the red cube will end up on the screen (easy, since we have the vertices and the ModelView matrix), then render it to an off-screen FBO just big enough to hold it (leaving margins for the glow); make sure to save the depth values to a texture.
3. Post-process the FBO and make the glow.
4. Now the hard part: merge the FBO with the screen. We need to take the depth values (which we stored in a texture) into account, so it looks like we need to do the following:
a) render a quad textured with the FBO's color attachment.
b) set up the ModelView matrix appropriately (we need to offset the quad by some vector, because in step 2 we intentionally rendered the red cube to an FBO smaller than the screen, for speed reasons).
c) in the 'merging' fragment shader, write gl_FragDepth from the FBO's depth attachment texture (and not from gl_FragCoord.z).
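
Step 2 of WAY 1 (finding a rectangle just big enough for the cube plus its glow) can be illustrated on the CPU. This is only a sketch under assumptions that are not in the post: the function names `project_to_pixels` and `glow_rect` are made up, the MVP matrix is given as a list of rows, every vertex is in front of the camera (clip-space w > 0), and the margin is given directly in pixels.

```python
import math

def project_to_pixels(vertex, mvp, width, height):
    """Project one (x, y, z) vertex to window coordinates in pixels.

    mvp is a 4x4 matrix given as a list of rows; the vertex is assumed
    to be in front of the camera, so clip-space w is positive.
    """
    v = (vertex[0], vertex[1], vertex[2], 1.0)
    clip = [sum(mvp[r][c] * v[c] for c in range(4)) for r in range(4)]
    ndc_x = clip[0] / clip[3]
    ndc_y = clip[1] / clip[3]
    # Map normalized device coordinates [-1, 1] to [0, width] x [0, height].
    return ((ndc_x * 0.5 + 0.5) * width, (ndc_y * 0.5 + 0.5) * height)

def glow_rect(vertices, mvp, width, height, margin):
    """Return (x, y, w, h): the mesh's screen rectangle grown by `margin`
    pixels on every side and clamped to the screen."""
    points = [project_to_pixels(v, mvp, width, height) for v in vertices]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0 = max(0, math.floor(min(xs)) - margin)
    y0 = max(0, math.floor(min(ys)) - margin)
    x1 = min(width, math.ceil(max(xs)) + margin)
    y1 = min(height, math.ceil(max(ys)) + margin)
    return (x0, y0, x1 - x0, y1 - y0)
```

The returned rectangle would give both the size of the small FBO and the offset needed in step 4b when placing the quad back on the screen.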

####### WAY 2 ###########

1. Render both cubes to an off-screen FBO; set up the stencil so that the unobstructed part of the red cube is marked with 1's.
2. Post-process the FBO so that the marked area gets blurred, and blend this to make the glow.
3. Blit the FBO to the screen

#######################

WAY 1 works, but the major problem with it is speed, namely step 4c: writing to gl_FragDepth in a fragment shader disables the early z-test.

WAY 2 also kind of works, and looks like it should be much faster, but it does not give 100% correct results.
The problem is when the red cube is partially obstructed by the yellow one, pixels of the red cube that are close to the yellow one get 'yellowish' when we blur them, i.e. the closer, yellow cube 'creeps' into the glow.

I guess I could partly remedy this by stopping the blur whenever the depth of the pixels I am reading suddenly decreases (meaning we just jumped from a farther object to a closer one). But that would mean twice as many texture accesses when blurring (on top of fetching the COLOR texture, we would have to keep fetching the DEPTH texture), plus a conditional statement in the blurring fragment shader. I haven't tried it, but I am not convinced it would be any faster than WAY 1, and even that wouldn't give 100% correct results: the red pixels close to the border with the yellow cube would only be influenced by the visible part of the red cube, rather than by the whole (-blurRadius, +blurRadius) area, so the glow there would not look the same as elsewhere.
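
The depth-aware blur idea can be tried out on the CPU before committing to a shader. The following is a minimal sketch, not the shader itself: it blurs a single row of pixels with a box filter, and the function name `depth_aware_blur` and the threshold `eps` are assumptions made for illustration. A neighbour is skipped when it is closer to the camera than the pixel being blurred.

```python
def depth_aware_blur(colors, depths, radius, eps=1e-3):
    """Box-blur a 1D row of RGB tuples, ignoring neighbours that are
    closer to the camera (smaller depth) than the center pixel."""
    n = len(colors)
    out = []
    for i in range(n):
        acc = [0.0, 0.0, 0.0]
        count = 0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            if depths[j] < depths[i] - eps:
                continue  # neighbour belongs to a closer object: skip it
            for k in range(3):
                acc[k] += colors[j][k]
            count += 1
        # count is at least 1 because the center pixel always passes.
        out.append(tuple(a / count for a in acc))
    return out
```

With a row of red pixels at depth 0.6 next to yellow pixels at depth 0.4, the red pixel at the border stays pure red, whereas with equal depths the yellow bleeds in. Each tap costs one extra depth lookup and one branch, which is the overhead mentioned above.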

Would anyone have suggestions how to best implement such 'per-object post-processing' ?

That, sadly, does not work in one case: the red cube partially obstructing the yellow one.

The glow extends out of the red cube by 'blurRadius' in each direction; after doing step 2) , the pixels of the glow that extend out of the red cube would still have Depth=1.0; when you then do step 3) and render the yellow cube, which happens to be behind the red one, the yellow cube would cover that part of the glow while it obviously shouldn't.
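
The failure case can be reproduced with a toy depth buffer. This sketch assumes a standard "less than" depth test with depth writes and a glow halo that was drawn without updating depth; the helper name `draw_fragment` and the concrete depth values are made up for illustration.

```python
def draw_fragment(framebuffer, depthbuffer, i, color, depth):
    """Write one fragment at pixel i if it passes a 'less than' depth test."""
    if depth < depthbuffer[i]:
        framebuffer[i] = color
        depthbuffer[i] = depth

# Pixel 0: surface of the red cube (depth 0.3 was written).
# Pixel 1: glow halo just outside the cube; depth is still the clear value 1.0.
framebuffer = ["red", "red glow"]
depthbuffer = [0.3, 1.0]

# The yellow cube lies behind the red one (depth 0.7) and covers both pixels.
for pixel in range(2):
    draw_fragment(framebuffer, depthbuffer, pixel, "yellow", 0.7)
```

After the loop, pixel 0 is still red, but pixel 1 has been overwritten by yellow even though the glow in front of it should have stayed visible.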

I know one can detect when each cube obstructs the other and render the scene accordingly:

but this is only an idealized example. Real requirements are, of course, render N different Meshes, each much more complicated than a cube, and some of them post-processed in certain ways. This needs to be order-independent, i.e. the algorithm needs to have no knowledge whether the glowing Mesh you are rendering right now will be partially covered by some other Mesh or not, or if there will be some other Mesh partially behind it.

WAY 2 also kind of works, and looks like it should be much faster, but it does not give 100% correct results.
The problem is when the red cube is partially obstructed by the yellow one, pixels of the red cube that are close to the yellow one get 'yellowish' when we blur them, i.e. the closer, yellow cube 'creeps' into the glow.

Ok, reading between the lines here I'm trying to infer your requirements. It sounds like you want:

1. A depth-honoring glow (not a screen-space overexposure HDR bleed type glow)
2. Your translucent "glow" to blend on top of the objects actually behind it (and the object).

So something like a "2D highlight" icon that's displayed around the object to show that it's selected.

Since the glows are translucent, you need to composite them in the right order. The objects they're highlighting are opaque (let's assume), so there are no ordering restrictions on drawing those.

Taking more expensive/complex approaches like A-buffers off the table for a second, how about this for a straw-man proposal:

1. Clear color+depth
2. Render all your opaque objects with depth test/write
3. Sort all of your opaque objects furthest to nearest
4. Render all of them again, but 80% translucent and enlarged by 5% around their center, with depth test and alpha blend but not depth write
(alternatively, you could use some bounding box/sphere object instead if you wanted)

This is for the case where they're all "highlighted" (aka have glows). If only a subset, sort/render that subset in #3 and #4.
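
Steps 3 and 4 of this proposal boil down to a back-to-front sort and a uniform scale about each object's center. Here is a minimal sketch of both, assuming each mesh is a dict with a hypothetical `"center"` entry and that distance from the camera to that center is a good enough sort key (which is one of the failure cases for large or interpenetrating objects):

```python
def back_to_front(meshes, camera_pos):
    """Return meshes sorted from furthest to nearest by center distance."""
    def distance_squared(mesh):
        return sum((c - p) ** 2 for c, p in zip(mesh["center"], camera_pos))
    return sorted(meshes, key=distance_squared, reverse=True)

def scale_about_center(vertices, center, factor):
    """Enlarge a mesh around its center, e.g. factor=1.05 for 5%."""
    return [
        tuple(c + factor * (v - c) for v, c in zip(vertex, center))
        for vertex in vertices
    ]
```

The enlarged copies would then be drawn in the sorted order with depth test and alpha blending enabled and depth writes disabled.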

I know this has failure cases. But does this get you close? If not, please explain what requirements you have (possibly unstated) that it doesn't meet.

It sounds like you want:
1. A depth-honoring glow (not a screen-space overexposure HDR bleed type glow)
2. Your translucent "glow" to blend on top of the objects actually behind it (and the object).
So something like a "2D highlight" icon that's displayed around the object to show that it's selected.

Yes, this is exactly right.

Of course, the initial 'red-yellow cubes' scene was just an idealized example (hopefully!) meant to show the essence of the problem.
What I am writing is a sort of OpenGL ES library for graphics effects. Clients are able to give it a series of instructions like 'take this Mesh, texture it with this, apply the following matrix transformations to its ModelView matrix, apply the following distortions to its vertices and the following set of fragment effects, then render to the following Framebuffer'.

In my library, I already have what I call 'matrix effects' (modifying the ModelView matrix), 'vertex effects' (various vertex distortions) and 'fragment effects' (various per-fragment changes of RGBA).
Now I am trying to add what I call 'post-processing' effects, this 'GLOW' being the first of them. I define the effect and envision it exactly as you described above.

The effects are applied to whole Meshes; thus now I need what I call 'per-object post-processing'.

The library is aimed mostly at '2.5D' usages, like GPU-accelerated UIs in mobile apps, 2D to 2.5D games (think Candy Crush), etc. I doubt people will ever use it for any real, large 3D game.
So FPS, while always important, is a bit less crucial than usual.

Originally Posted by Dark Photon

1. Clear color+depth
2. Render all your opaque objects with depth test/write
3. Sort all of your opaque objects furthest to nearest
4. Render all of them again, but 80% translucent and enlarged by 5% around their center, with depth test and alpha blend but not depth write
(alternatively, you could use some bounding box/sphere object instead if you wanted)

This is for the case where they're all "highlighted" (aka have glows). If only a subset, sort/render that subset in #3 and #4.

I try really hard to keep the API 'Mesh-local', i.e. the rendering pipeline only knows about the current Mesh it is rendering. My main complaint about the above is that it has to be aware of the whole set of Meshes we are going to render to a given Framebuffer. That being said, if 'Mesh-locality' is impossible or cannot be done efficiently with post-processing effects, then I guess I'll have to give it up (and make my tutorials more complicated).

Yesterday I was thinking about this:

Code :

# 'Almost-Mesh-local' algorithm for rendering N different Meshes, some of them glowing
Create FBO, attach texture the size of the screen to COLOR0, another texture 1/4 the size of the screen to COLOR1.
Enable DEPTH test, clear COLOR/DEPTH
FOREACH( glowing Mesh )
{
    use MRT to render it to COLOR0 and COLOR1 in one go
}
Detach COLOR1, attach STENCIL texture
Set up STENCIL so that the test always passes and writes 1s when Depth test passes
Switch off DEPTH/COLOR writes
FOREACH( glowing Mesh )
{
    enlarge it by N% (amount of GLOW needs to be modifiable!)
    render to STENCIL // i.e. mark the future 'glow' regions with 1s in stencil
}
Set up STENCIL so that test always passes and writes 0 when Depth test passes
Switch on DEPTH/COLOR writes
FOREACH( not glowing Mesh )
{
    render to COLOR0/STENCIL/DEPTH // now COLOR0 contains everything rendered, except for the GLOW. STENCIL marks the unobstructed glowing areas with 1s
}
Blur the COLOR1 texture with BLUR radius 'N'
Merge COLOR0 and COLOR1 to the screen in the following way:
    IF ( STENCIL==0 ) take pixel from COLOR0
    ELSE blend COLOR0 and COLOR1
END
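
The final merge step of this pseudocode can be checked per pixel on the CPU. This is a sketch under assumptions that the pseudocode does not fix: images are lists of rows of RGB tuples, COLOR1 is smaller than COLOR0 by an integer factor `scale` per dimension and is sampled with nearest-neighbour lookup, and the blend is a plain linear mix with a hypothetical `glow_alpha` weight.

```python
def merge(color0, color1, stencil, scale, glow_alpha):
    """Combine the full-size scene (color0) with the blurred glow (color1).

    Where stencil is 0 the scene pixel is kept as-is; elsewhere the
    glow is mixed in with weight glow_alpha.
    """
    height = len(color0)
    width = len(color0[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            base = color0[y][x]
            if stencil[y][x] == 0:
                row.append(base)
            else:
                # Nearest-neighbour lookup into the smaller glow texture.
                glow = color1[y // scale][x // scale]
                row.append(tuple(
                    (1.0 - glow_alpha) * b + glow_alpha * g
                    for b, g in zip(base, glow)
                ))
        out.append(row)
    return out
```

On the GPU the same decision could be made either by the stencil test itself (drawing the glow quad only where the stencil is 1) or by a branch in the merging fragment shader.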

This is not Mesh-local (we still need to be able to process all 'glowing' Meshes first), although I call it 'almost Mesh-local' because it differentiates between Meshes only on the basis of the effects being applied to them, and not on which one is where or which obstructs which.

It can also have problems when two GLOWING Meshes obstruct each other (the blending is not guaranteed to happen in the right order), although with the GLOW being half-transparent, I am hoping the final look will be more or less ok.

Looks like it can even be turned into a completely 'Mesh-local' algorithm by doing one giant

Code :

FOREACH(Mesh)
{
if( glowing )
{
}
else
{
}
}

although at the cost of having to attach and detach things from the FBO and set up the STENCIL differently at each loop iteration.