Heat map visualization shader help please

I don't have a lot of shader programming experience, so I'm trying to do something that will expand my horizons.

Suppose you have a list of events at various world positions, each with an influence radius, and a 3D model of an environment. I'm trying to figure out the best way to take this world event data and render it in 3D with the usual heat map behavior: a radial falloff of influence around each event, accumulation of stacked influences where events overlap, and ultimately a cool-to-warm color mapping based on the resulting weight range.

I'm not sure whether I'm going about this the wrong way, but my current attempt is to pass an array of event world positions into the shader I render the world with. I'm then trying to figure out, in the shader, how to calculate the accurate world position of each fragment, so that I can accumulate the event 'weights' that should affect that particular pixel from all the events. I'm thinking I would need to write those accumulated results into a floating-point FBO, render the scene again, and map the accumulated values in that buffer to the cold-to-hot color gradient that I want to see.

Anyway, I'm spinning my wheels at this first step. I don't know how to calculate the world position of a fragment so I can compare it against the event world positions passed in through a uniform array. I'm trying a dead-simple test setup to see this working: one event in the middle of my map, with the color blending from green to red for pixels within 500.0 world distance of the event, which should show a color gradient 500 world units around my event position.

Anyone know how to calculate the world position of a fragment in the shader? My mesh doesn't have UVs, as it is a simple colorized debugging model of the navigation mesh of a game level.

Any help appreciated, both on my current attempt and on whether there is a better or alternative way to do this.

...list of events at various world positions, with an influence radius, ...3d model of an environment. ...render it in 3d in a form that illustrates the usual heat map functionality, ...cool to warm color mapping based on the weight range. ...passing an array of event world positions into the shader that I render the world with

Ok, so if I gather correctly, you want to take a 3D scalar intensity field generated by a set of 3D point emitters, sample it at the surfaces of the objects in your environment, and render that as a color value on the geometry, with the color value chosen from a 1D heat map gradient.

This sounds very similar to standard realtime rendering using point light sources (except for the heat map), so google that and you'll get tons of hits with example code (fragment lighting, vertex lighting, etc.). For instance, one of many here.

Lots of ways to fry that fish, but passing in an array of point data for the emitters when rendering the scene is probably reasonable if you want the shader to dynamically sample/compute the field value at sample points every redraw and the number of point emitters isn't huge. Of course, if you're doing an interactive/realtime vis of this (e.g. with user trackball to reorient/zoom the model) and computing the value of the scalar field is expensive or otherwise impractical for realtime, then it's probably better to precompute and possibly even presample the field before rendering ...but for now, we'll assume that computing the value of this scalar field is "cheap" and can be done at render time on the GPU, with either forward or deferred shading.

...and then I'm trying to figure out in my shader how to calculate the accurate world position of the fragment, so that I can basically calculate the accumulation of event 'weights' that should effect that particular pixel from all the events. ...Anyways, I'm spinning my wheels at this first step.

Ok. Couple things that might be useful to you here:

First, computing a WORLD-space position in the shader. If you're using a standard forward shading approach, where you're computing the color/radiance of your fragments as you are rasterizing the original polygonal geometry for scene objects that you want colored (or lit), then you don't need to compute the WORLD-space position of the fragment from complete scratch. You typically pass in an OBJECT-space position to your vertex shader, and if you also pass in a MODELING transform as a uniform, you can use that to take those positions to WORLD-space very simply (or more conventionally, pass in a MODELVIEW transform and take those positions to EYE-space). If you want this in the fragment shader, just pass WORLD (or EYE) space positions down to it via an interpolator. See the link I posted above for one example of this (search for ecPos).
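For instance, here's a minimal sketch of a forward-shading vertex shader that does this, in legacy GLSL to match what you're using (uModelMatrix is an assumed uniform name for your MODELING transform; vWorldPos is the interpolator):

```glsl
// Vertex shader: passes the WORLD-space position down to the fragment shader.
// uModelMatrix is the MODELING (object-to-world) transform, set from the CPU.
uniform mat4 uModelMatrix;
varying vec3 vWorldPos;

void main()
{
    vec4 worldPos = uModelMatrix * gl_Vertex;   // OBJECT-space -> WORLD-space
    vWorldPos     = worldPos.xyz;               // interpolated per-fragment
    gl_Position   = gl_ModelViewProjectionMatrix * gl_Vertex;
}
```

The fragment shader then just reads vWorldPos and has the world position of the fragment for free.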

On the other hand, if you were applying this to your scene in a deferred shading/rendering style, where the geometry is pre-rasterized into multiple screen-size buffers and long since forgotten, then you'd need to recompute the full vec3 fragment position since you likely only have a depth value for the fragment saved off. For that, you could use something like this. No need if you're using forward shading though.
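For completeness, a rough sketch of the deferred-style reconstruction, for when all you have saved is a depth buffer (uDepthTex and uInvProjMatrix are assumed names; vTexCoord is the full-screen quad's [0,1] UV):

```glsl
// Fragment shader sketch: reconstruct the EYE-space position of a fragment
// from a stored depth buffer during a deferred full-screen pass.
uniform sampler2D uDepthTex;   // depth saved from the geometry pass
uniform mat4 uInvProjMatrix;   // inverse of the PROJECTION matrix
varying vec2 vTexCoord;        // full-screen quad UV in [0,1]

void main()
{
    float depth = texture2D( uDepthTex, vTexCoord ).r;               // window depth in [0,1]
    vec4 ndc    = vec4( vec3( vTexCoord, depth ) * 2.0 - 1.0, 1.0 ); // map to NDC [-1,1]
    vec4 eyePos = uInvProjMatrix * ndc;
    eyePos     /= eyePos.w;                                          // undo perspective divide
    // eyePos.xyz is now the EYE-space position of this fragment
    gl_FragColor = vec4( eyePos.xyz, 1.0 );
}
```

Again, no need for any of this if you're using forward shading.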

On choosing forward or deferred: Where you might want to use a deferred rendering style approach is when you have "a lot" of point light sources (hundreds, thousands) potentially influencing your 3D scalar (or vector) field, and you want to avoid the complete waste of considering them all at every fragment rendered. Basically, it lets you take advantage of spatial locality fairly easily to avoid lots of wasted compute cycles for this case. There are other optimization strategies for the "lots of point sources" case as well (clustered deferred/forward, etc.).

(Also, regarding WORLD-space positions: you typically don't work with WORLD-space in your shader as that puts a small limit on how big your "world" can be, due to limited float precision; for this reason, often EYE-space is used for these computations instead, but if your world is small you don't care).

Second, there's the question of at what "rate" you sample your 3D scalar field. You could compute it at each vertex of your geometry and interpolate the results across polygons. Or you could compute it at each fragment (typically per-pixel, but could be per sub-pixel sample if supersampling is enabled).

For the former, in the vertex shader you could just take the OBJECT-space positions, map them to WORLD (or EYE) space, sample or compute your scalar field value, map that through a 1D lookup texture to give you the heat map color, and then store that in a vec3 interpolator to be passed to the fragment shader (the GPU would automatically interpolate the color value across triangles for you). The fragment shader would then just use that interpolated color value as its output color value. For the latter, your vertex shader would be very simple: You'd just compute the WORLD (or EYE) space position and store that in an interpolator for passing to the fragment shader. Then in the fragment shader, you basically take that and do everything else described above.

I'm thinking I would need to write those accumulated results into a floating point FBO buffer and then render the scene again and mapping the accumulated values in that buffer to the cold to hot color gradiant that I want to see it as.

Not necessarily. If the cost of computing values for the 3D scalar field is relatively cheap, then you could do this computation and accumulation in registers within each fragment shader (or vertex shader) execution and not need an intermediary buffer. GPUs have a ton of power nowadays -- you can get away with a lot here (that is, it may not look cheap but it could be "cheap enough").

If the cost of computing values for the 3D scalar field was expensive purely because you have a lot of point sources and forward shading is too slow for this case, a deferred rendering technique might make realtime computation at the surfaces of your visible objects possible, if that was desirable. Here's where you'd have FBO intermediaries which you can use to accumulate the total value of the field at the surfaces of your objects.

If however computing the values of the field was too expensive for realtime eval in any form, there are a number of ways you could precompute and potentially presample your 3D scalar field prior to rendering so that rendering is really realtime. For instance, you could precompute/presample the 3D scalar field to some resolution (on the CPU or GPU) and store it in a 3D texture (or 2D texture, for 2.5D scenarios), which would then be sampled and interpolated dynamically on the GPU during realtime rendering. Or, you could precompute/presample the field at the vertices of your geometry, store those values off on your geometry, and then rendering is almost mindless because all the work has been done. There are other options too, if sampling/computing your field values is too expensive for realtime.
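As a sketch of the 3D-texture option, sampling a presampled field at render time would be about this simple (uFieldTex, uHeatRamp, uWorldMin, and uWorldSize are assumed names; vWorldPos is a world-space position interpolator from the vertex shader):

```glsl
// Fragment shader sketch: look up a presampled scalar field from a 3D texture
// and map the value through a 1D cold-to-hot gradient texture.
uniform sampler3D uFieldTex;   // presampled field values, normalized to [0,1]
uniform sampler1D uHeatRamp;   // 1D cold-to-hot color gradient
uniform vec3 uWorldMin;        // world-space min corner of the sampled volume
uniform vec3 uWorldSize;       // world-space extents of the sampled volume
varying vec3 vWorldPos;

void main()
{
    vec3 uvw     = ( vWorldPos - uWorldMin ) / uWorldSize;  // world -> [0,1]^3
    float weight = texture3D( uFieldTex, uvw ).r;           // trilinear interpolation for free
    gl_FragColor = texture1D( uHeatRamp, weight );
}
```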

My mesh doesn't have UVs, as it is a simple colorized debugging model of the navigation mesh of a game level.

Wow that's a lot of information. I appreciate your taking the time to respond in such detail. Unfortunately my familiarity with all the terminology of graphics rendering is not all that advanced. I'm an A.I. programmer by trade, which is why I'm trying to set up visualization on a navigation mesh built for an A.I. bot.

Couple things. First, computing a WORLD-space position in the shader. If you're using a standard forward shading approach, where you're computing the color/radiance of your fragments as you are rasterizing the original polygonal geometry for scene objects that you want colored (or lit), then you don't need to compute the WORLD-space position of the fragment from complete scratch. You typically pass in an OBJECT-space position to your vertex shader, and if you pass in a MODELING transform as a uniform, that'll take those positions to WORLD-space very simply (or more conventionally, pass in a MODELVIEW transform and take those positions to EYE-space). If you want this in the fragment shader, just pass WORLD (or EYE) space positions down to it via an interpolator.

I'm sorry, but I'm still newbie enough to the graphics programming lingo that I don't fully understand the big picture here. Specifically I'm still fuzzy on the various spaces that the different shader stages operate on. From what I've learned (and hopefully have a correct understanding of), the vertex shader calculates vertex positions, and the position value is interpolated for the fragment shader stage of that geometry. What space are those vertex positions in within the shader, model space? Are you saying that I could calculate the world-space position in the vertex shader into a varying variable, which I think will be interpolated such that the fragment shader gets the world position based on that interpolation? And are you saying that the depth-based calculations like you linked to are for situations where you are beyond that point in the rendering pipeline? I've spent a day or so tinkering with some of those functions and haven't been able to get them working, probably from my lack of understanding. I've been passing in a depth of

Code :

float depth = gl_FragCoord.z / gl_FragCoord.w;

Most of the uses appear to pull from the depth buffer, which I think I understand as being a deferred rendering implementation since it needs to reconstruct the position from the depth buffer and pixel only. It sounds like you are saying that isn't necessary for my desired use case, as I would be doing the calculations directly in the fragment shader of the geometry I'm colorizing. Do I have that right?

I don't think calculating the data at the vertex level is sufficiently detailed. With the mesh being such low poly for navigational purposes, the influence of the events may be only a small radius inside a larger polygon. I would like to be able to visualize the fine grain event distribution within a large room that may only be represented as a rectangle via 2 triangles.

Not to complicate things before I have the basics down, but the reason I mention a floating-point FBO is that I also have a longer-term goal of having the application normalize the heat map automatically. For example, there could be hundreds of events that overlap a small area, and they must be allowed to accumulate arbitrarily to a large value. The hope is to somehow determine the maximum and minimum values so the fragment shader can use that range to perform its colorization. This is a stretch goal, and at first the range will probably be user defined via sliders or a simple GUI or something. Someone in another forum said this might be possible by rendering the entire map to a buffer, then recursively rendering that frame buffer at half size with a shader that writes, as each pixel value, the min/max weight of the 4 source pixels it covers, effectively propagating the min/max weighting up to a 1x1 texture that I can do a getpixel on (or whatever) to get the weight range automatically from the rendered scene. It sounds rather complex and not something I want to worry about just yet.

Just to give more context, I'm trying to get my feet wet in some graphics and shader stuff in order to make a heat map visualizer for game analytics. Sorta like you might have seen pretty often in a game context or a variety of other contexts.

Here is an example of a map. It will generally be very low poly, probably only a few thousand triangles in most cases. In this case the entire level is less than 800 triangles, though it is one of the smaller ones. If you are familiar with the Team Fortress games, this is a navigation mesh from 2fort.

I'm not sure if this is a reasonable expectation, especially since I don't really see any examples of people doing heat maps in 3D realtime, but there will be 'events' in the thousands, maybe even tens of thousands for long game matches, for things such as shots being fired. I think in the majority of cases the data will be visualized from a god's-eye view where most of the data will be in play at a given time. Ideally I would like the viewer to be able to visualize the events in real time, essentially as they come in, such as when the viewer is connected to the game via network. If that is too infeasible, the alternative is to take a file dump of the data set, load and preprocess it, and then be able to fly around it to look at the information. I show the side view of the mesh to show that there is enough overlap in the map geometry that a 2D heat map is far less useful than a 3D one, though far easier to implement.

From the research I have done and the people I have talked to, some of the implementation possibilities tread on deferred rendering and/or treating the events as lights somehow, but as there may be thousands of them, I get concerned whether those approaches are viable, since they may all be visible most of the time just by the nature of the visualization. When I learned about passing data into the shaders in the form of uniform vectors, I immediately thought that it would be easy to pass the relevant event data into the shader that way in the form of something like this.

Code :

uniform int eventCount;
uniform vec4 events[32]; // this would get much bigger at some point obviously
uniform float eventWeight[32]; // the strength of the event, additively added to other events of similar type to accumulate arbitrarily
uniform float eventRadius[32]; // the world distance radius of the event, through which the weight reduces to 0

Then, knowing that world space information in the shader, I was hoping that my fragment shader could basically do something like this pseudocode

Code :

float fragmentWeightAccum = 0.0;
for( int i = 0; i < eventCount; i++ )
{
    fragmentWeightAccum += CalculateEventEffectOnWorldPosition( fragmentWorldPos, i );
}
gl_FragColor = MapWeightingToColorGradient( fragmentWeightAccum ); // probably by mapping it to a user-defined min/max weighting; maybe eventually a rendering trick could provide back the max weighting from all the event blending performed on the GPU.

I think this is what you mean by calculating the values in the registers of the shader, and not requiring an FBO.

I'm not sure how scalable this would be in terms of performance falloff with event count, but it seemed simple enough to try at least. In my day or two of trying to figure out how to get the fragment's world position, though, I've been mostly confused by the various 'spaces' discussed in the threads I've come across.

Again I appreciate your time and wisdom. I intend to share this publicly as an analytic viewer when it reaches a usable point.

I'm sorry, but I'm still newbie enough to the graphics programming lingo that I don't fully understand the big picture here. Specifically I'm still fuzzy on the various spaces that the different shader stage operates on.

OBJECT coordinates are typically what you feed the GPU (aka model coordinates). MODELVIEW is a product of two transforms: the MODELING transform, which takes OBJECT-space to WORLD-space, and VIEWING which takes WORLD-space to EYE-space.

The OpenGL Programming Guide has a good chapter named "Viewing" IIRC which describes the transforms if you want more detail. If you have a specific question, feel free to post.

From what I've learned(and hopefully have a correct understanding), the vertex shader will calculate vertex positions and will interpolate the position value for the shader stage of that geometry. What space are those vertex positions in within the shader, model space?

The pipeline is flexible, so there are lots of other options here, but most frequently you feed vertex positions into your vertex shader through a vertex attribute populated on the CPU, outside the GPU program. You can put these input positions in whatever space you want (since you're writing the vertex shader), but most commonly they are in the OBJECT-space of the model. Via a vertex shader output, the GPU needs to be provided these positions in CLIP-space (see diagram above), so all that's strictly required is that you transform these input positions to clip space via the MODELING*VIEWING*PROJECTION matrix, aka ModelViewProj.

If you instead wanted to feed world-space positions into your vertex shader you could. In this case, your MODELING transform would just be the identity.

Now, for whatever shader you use to compute point event influences, you're probably going to want these positions in WORLD or EYE space. And to get that, you just multiply your input OBJECT-space positions by the MODELING transform or MODELVIEW transform, respectively.

Are you saying that I could calculate the world space in the vertex shader into a varying variable, which I think will interpolate it such that the fragment shader would get the world position based on the interpolation? And are you saying that the depth based calculations like you linked to are for situations where you are beyond that point in the rendering pipeline?

Yes, and yes. You've got it.

...Most of the uses appear to pull from the depth buffer, which I think I understand as being a deferred rendering implementation since it needs to reconstruct the position from the depth buffer and pixel only. It sounds like you are saying that isn't necessary for my desired use case, as I would be doing the calculations directly in the fragment shader of the geometry I'm colorizing. Do I have that right?

I don't think calculating the data at the vertex level is sufficiently detailed. With the mesh being such low poly for navigational purposes, the influence of the events may be only a small radius inside a larger polygon. I would like to be able to visualize the fine grain event distribution within a large room that may only be represented as a rectangle via 2 triangles.

Gotcha. So sampling at the vertices and interpolating is out.

Not to complicate things before I have the basics down, but the reason I mention a floating point FBO is that I also have a longer term goal of wanting the application to normalize the heat map automatically. For example, there could be hundreds of events that overlap a small area, and they must be allowed to accumulate arbitrarily to a large value. The hope would then be to somehow be able to look at the maximum and minimum values somehow and then the fragment shader would use that range in order to perform its colorization. ...there will be 'events' in the thousands, maybe even tens of thousands for long game matches for events such as shots being fired and such.

I see. That makes sense. Also, this mention of "hundreds" to "tens of thousands" of point influences really casts doubt on whether computing the scalar field directly in the shader while rasterizing the mesh is going to be fast enough at the fragment level.

I think the majority of cases the data will be visualized from a gods eye view where most of the data will be in play at a given time. Ideally I would like the viewer to be able to visualize the events in real time, essentially as they come in, such as if the viewer is connected to the game via network. If that is too infeasible the alternative is to take a file dump of the data set and load, preprocess, and then be able to fly around it to look at the information. I show the side view of the mesh to show that there is sufficient enough overlap in the map geometry that a 2d heatmap is far less useful than a 3d one, though far easier to implement.

Just to clarify: is the heat map you want to render only for values sampled on the 2D mesh (which itself is probably 2.5D)? Or do you actually want to render a volumetric field?

From the research I have done and the people I have talked to, some of the implementation possibilities tread on deferred rendering and/or treating the events as lights somehow, but as there may be thousands of them, I get concerned whether those approaches are viable, since they may all be visible most of the time just by the nature of the visualization.

It just depends on your needs (more on this below). Thing is, when you get into the hundreds or thousands of influences, you don't want to be computing the influence of each of these for every single pixel on the screen if you don't have to (when you're aiming for realtime or at least interactive performance that is). Frequently if you have this many on the screen, the area of influence of most items is relatively small. Deferred just takes advantage of that to speed things up.

When I learned about passing data into the shaders in the form of uniform vectors, I immediately thought that it would be easy to pass the relevant event data into the shader that way in the form of something like this.

Code :

uniform int eventCount;
uniform vec4 events[32]; // this would get much bigger at some point obviously
uniform float eventWeight[32]; // the strength of the event, additively added to other events of similar type to accumulate arbitrarily
uniform float eventRadius[32]; // the world distance radius of the event, through which the weight reduces to 0

That's sure what I'd start with, just to get some first renderings up. The issue you'll run into is that there's a limit on the amount of uniform space you can pass into a shader, so at some point you end up needing to shift your tech approach a bit.

Then, knowing that world space information in the shader, I was hoping that my fragment shader could basically do something like this pseudocode

Code :

float fragmentWeightAccum = 0.0;
for( int i = 0; i < eventCount; i++ )
{
    fragmentWeightAccum += CalculateEventEffectOnWorldPosition( fragmentWorldPos, i );
}
gl_FragColor = MapWeightingToColorGradient( fragmentWeightAccum ); // probably by mapping it to a user-defined min/max weighting; maybe eventually a rendering trick could provide back the max weighting from all the event blending performed on the GPU.

I think this is what you mean by calculating the values in the registers of the shader, and not requiring an FBO.

Yes, and that's simplest to start with (I would). But with your goal of thousands to tens of thousands, you'll probably have to shift approaches as you scale this up.
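Fleshing out that pseudocode, a minimal fragment shader along those lines might look like this, using a simple linear radial falloff (uHeatRamp, uMinWeight, and uMaxWeight are my own assumed names; also note that on older hardware, fragment-shader loops indexing uniform arrays may need a constant bound rather than eventCount):

```glsl
// Fragment shader sketch: accumulate event weights with linear radial falloff,
// then map the total through a user-defined range onto a 1D gradient texture.
uniform int   eventCount;
uniform vec4  events[32];        // xyz = event world position
uniform float eventWeight[32];   // strength at the event center
uniform float eventRadius[32];   // distance at which influence reaches 0
uniform float uMinWeight;        // user-set range for colorization
uniform float uMaxWeight;
uniform sampler1D uHeatRamp;     // 1D cold-to-hot gradient texture
varying vec3  vWorldPos;         // world-space position from the vertex shader

void main()
{
    float accum = 0.0;
    for( int i = 0; i < eventCount; i++ )
    {
        float dist    = distance( vWorldPos, events[i].xyz );
        float falloff = max( 1.0 - dist / eventRadius[i], 0.0 ); // 1 at center, 0 at radius
        accum += eventWeight[i] * falloff;
    }
    float t = clamp( ( accum - uMinWeight ) / ( uMaxWeight - uMinWeight ), 0.0, 1.0 );
    gl_FragColor = texture1D( uHeatRamp, t );
}
```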

eureka!

So hey, armed with a bit better understanding from your first post I managed to get something working rather quickly. I think I got hung up on search terms yesterday that I was chasing the wrong type of implementation. Thanks again for putting me back on track.

Do you know what the max uniform limit is? Will this method scale up to thousands or tens of thousands of events? If not, would a reasonably performant alternative be to 'encode' the events into a large floating-point RGBA texture? Unless I reduce the data I get, it may mean using 2 pixel values per event, as I am trying to have per-event x, y, z, radius, weightmin, weightmax. Maybe eventually I'll add some sort of type identifier as a filtering mechanism. That's a less than ideal use for a float value, but I guess it could work.

You can manipulate the weight max in the parameter list to get a real-time colorize adjustment. Pretty useful for being able to visualize subtle areas of weighting that the large accumulations end up washing out. It would still be nice to be able to algorithmically figure out the min/max values within which to clamp the manual adjustment, or let it auto-adjust.

I need to figure out how to get some basic white light shading in there so everything doesn't look so uniform and flat. Know offhand a simple shading adjustment I can add to the shaders in order to essentially hard code a top down directional white light so I can tell depth and layers apart? Thanks.

Yes, and that's simplest to start with (I would). But with your goal of thousands to tens of thousands, you'll probably have to shift approaches as you scale this up.

Is the more scalable approach deferred rendering, or are there other alternatives? I am guessing that the costly part of doing it this way, assuming I could get the larger data sets into the shader somehow (like storing them in a big float texture or something), is that each pixel of the rendered object will loop sequentially through a potentially big data set in order to accumulate its weight information. Even though that data is cache friendly in how it is being searched, it is still touching a lot of data.

Perhaps I could set up some form of grid partitioning where the world space of the pixel could index into a texture and somehow get a far reduced set of data to go through.

Maybe a 2D texture, or a coarse 3D one, that is effectively treated as an occupancy grid mapped to the dimensions of the world, with all the events rendered into it as simple black or white, such that the heat map shader can early-out of doing any search at all for pixels that have no event overlap.

Or perhaps there is some sort of trick where I can render simple quads that represent the events in a way that reduces the expensive fragment work to only areas where events exist, and there isn't a bunch of pixels that end up with no influence needing to run through a bunch of events only to end up with nothing.

It varies by card, but on a relatively recent GPU (GTX 580), the max amount of ordinary uniform space for a frag shader is about 2048 32-bit floats (MAX_FRAGMENT_UNIFORM_COMPONENTS), so depending on the amount of space per point event, we're talking dozens to hundreds of events, assuming there aren't any other big consumers of uniform space. Past that you could shift to storing the data in uniform buffer objects, where you get ~14 binding points, each of which can hold ~64KB. And you can go to texture or image data past that to further exceed the limit (I would actually go straight to texture and skip UBOs myself).
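As a sketch of the texture route, packing two RGBA texels per event as you described, the fragment shader could fetch event data something like this (uEventTex and uEventTexWidth are assumed names, and I'm assuming all events fit in one row of a float texture with nearest filtering):

```glsl
// Fragment shader helpers: pull per-event data out of a floating-point
// texture instead of uniform arrays, two RGBA32F texels per event.
// Texel layout -- texel 0: x, y, z, radius; texel 1: weight, type, unused, unused.
uniform sampler2D uEventTex;     // RGBA32F event data, nearest filtering
uniform float uEventTexWidth;    // texture width in texels
uniform int   eventCount;

vec4 fetchTexel( int texelIndex )
{
    float u = ( float( texelIndex ) + 0.5 ) / uEventTexWidth; // sample texel center
    return texture2D( uEventTex, vec2( u, 0.5 ) );
}

void getEvent( int i, out vec3 pos, out float radius, out float weight )
{
    vec4 a = fetchTexel( i * 2 );
    vec4 b = fetchTexel( i * 2 + 1 );
    pos    = a.xyz;
    radius = a.w;
    weight = b.x;
}
```

The accumulation loop then calls getEvent instead of indexing uniform arrays, and the 32-event ceiling goes away.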

But I suspect before you even get to that point with the space issue, you'll find you want another approach just due to time consumption applying all these point sources to every fragment on the screen.

Will this method scale up to thousands or tens of thousands of events?

That depends on your frame rate requirements, target GPU, and complexity of your weight computation. But I'd push your existing technique as far as you can until you know you need a plan B. Then you "know" you need it.

Would still be nice to be able to algorithmically figure out the min/max value within which to clamp the manual adjustment or let it auto adjust.

You can definitely do that as a post-process. Ping-pong reduction as you described before on the GPU, or for starters just do a CPU readback of the resulting accumulated weights and reduce there (i.e. compute min/max).
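One step of that ping-pong reduction could be a shader like this (uSrcTex and uSrcTexelSize are assumed names; render it repeatedly into half-size FBOs until you're down to 1x1, then read back the single pixel):

```glsl
// Fragment shader sketch: one step of a min/max ping-pong reduction.
// Each output pixel stores the min (r) and max (g) of 4 source pixels.
uniform sampler2D uSrcTex;    // r = min weight, g = max weight from previous pass
uniform vec2 uSrcTexelSize;   // 1.0 / source texture dimensions
varying vec2 vTexCoord;

void main()
{
    vec2 s00 = texture2D( uSrcTex, vTexCoord ).rg;
    vec2 s10 = texture2D( uSrcTex, vTexCoord + vec2( uSrcTexelSize.x, 0.0 ) ).rg;
    vec2 s01 = texture2D( uSrcTex, vTexCoord + vec2( 0.0, uSrcTexelSize.y ) ).rg;
    vec2 s11 = texture2D( uSrcTex, vTexCoord + uSrcTexelSize ).rg;

    float lo = min( min( s00.r, s10.r ), min( s01.r, s11.r ) );
    float hi = max( max( s00.g, s10.g ), max( s01.g, s11.g ) );
    gl_FragColor = vec4( lo, hi, 0.0, 1.0 );
}
```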

I need to figure out how to get some basic white light shading in there so everything doesn't look so uniform and flat. Know offhand a simple shading adjustment I can add to the shaders in order to essentially hard code a top down directional white light so I can tell depth and layers apart?

Just mixing in a dot( normal, lightVector ) term, attenuated to taste, will get you a long way.
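For instance, a sketch assuming you pass a WORLD-space normal down from the vertex shader as a hypothetical vWorldNormal varying:

```glsl
// Fragment shader helper sketch: hard-coded top-down white directional light
// mixed into the heat map color, so layers and depth read visually.
varying vec3 vWorldNormal;   // WORLD-space normal from the vertex shader

vec3 applyTopDownLight( vec3 heatColor )
{
    vec3 lightDir = vec3( 0.0, 1.0, 0.0 );   // light shining straight down (assumes Y-up)
    float nDotL   = max( dot( normalize( vWorldNormal ), lightDir ), 0.0 );
    float ambient = 0.3;                     // floor so downward faces aren't pure black
    return heatColor * ( ambient + ( 1.0 - ambient ) * nDotL );
}
```

Call it on your heat map color just before writing gl_FragColor, and tweak the ambient term to taste.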

Ok so I moved the event data into a floating point texture so I could crank up the numbers. Couple oddities.

I would have thought that colorization via a fragment shader would not be able to z-fight, but you can see some z-fighting when zoomed out. Is there something that can cause z-fighting like this with shader work?

Here is a closer view

Secondly, it appears to be fill-rate limited, as scaling the window down or up affects the performance significantly. I basically expect this due to the complexity of the shader at the moment, but the part I didn't expect is that this performance is also reflected in CPU usage in Task Manager. I would have thought fill-rate limitations would be on the GPU side. Even with the render calls being blocking calls, I would expect the program to basically block, not show up as CPU usage.