Realtime histogram?

So, like a lot of people I imagine, I've futzed around a lot with fine-tuning post-processing effects like bloom. There are probably smart ways to do it, but I've decided that hand-tweaking constants is a fool's errand, and I want a smarter approach.

It occurred to me that if I could generate a real-time histogram of my image, I could use that data to tune my bloom filter's strength on the fly. So, when looking out of a dark cave where most of the scene is dark, the bloom effect could be ramped up; in an open, well-lit area, the bloom could be turned down. Etc.

Obviously, readPixels and a CPU-based histogram is not the way to go, so I'm wondering what kind of options I have. My thought is to write a histogram shader that writes into a 256x1 1D texture, and have the post-processing pass sample from that texture and run some logic to decide how hard to apply the filter. But I haven't really thought too hard about this yet.
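To check that the "sample the histogram and decide" logic even works, here's a CPU model of it in Python. This is purely illustrative (the real binning would happen on the GPU into that 256x1 texture), and the `bloom_strength` policy — scaling up with the fraction of pixels in the dark bins — is an assumption, not anything from the thread:

```python
# Illustrative CPU model of the 256-bin histogram idea; on the GPU the
# bins would live in a 256x1 texture sampled by the post-process pass.

def luminance(r, g, b):
    # Standard Rec. 601 luma weights.
    return 0.299 * r + 0.587 * g + 0.114 * b

def histogram_256(pixels):
    """pixels: iterable of (r, g, b) floats in [0, 1]."""
    bins = [0] * 256
    for r, g, b in pixels:
        idx = min(255, int(luminance(r, g, b) * 256))
        bins[idx] += 1
    return bins

def bloom_strength(bins, dark_cutoff=64, max_strength=2.0):
    # Hypothetical policy: the darker the scene (more pixels in the
    # low bins), the stronger the bloom, up to max_strength.
    total = sum(bins) or 1
    dark_fraction = sum(bins[:dark_cutoff]) / total
    return 1.0 + (max_strength - 1.0) * dark_fraction

dark_scene = [(0.05, 0.05, 0.05)] * 90 + [(0.9, 0.9, 0.9)] * 10
print(bloom_strength(histogram_256(dark_scene)))  # 1.9 — close to max
```

The nice part of having the full histogram (rather than just an average) is that the policy can look at the shape of the distribution, not just its mean.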

In generating bloom, one approach is to build a pyramid of downscaled bloom textures which are then composited back to the screen... the trick I once used was to read the top-most, smallest texture (I think it was about 32x32 in our case) back into main memory, find the average overall intensity from it, smooth the result over time, and then use it to modulate the bloom in the next frame.

So yes, I guess we were doing a CPU-based histogram of sorts, but because we were dealing with such a small texture it was cheap.
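The engine-side part of that trick — average the small readback, smooth it over frames, turn it into a bloom multiplier — could look something like this. Python stand-in for illustration; `texels` would come from reading back the 32x32 top of the pyramid, and the `1/luminance` mapping is just one plausible choice:

```python
# Sketch of the readback-and-smooth approach. The texel list stands in
# for a glReadPixels (or lock) of the 32x32 top of the bloom pyramid.

def average_intensity(texels):
    """texels: luminance floats in [0, 1] from the small readback."""
    return sum(texels) / len(texels)

class BloomController:
    def __init__(self, smoothing=0.1):
        self.smoothing = smoothing  # fraction of new reading blended in per frame
        self.smoothed = 0.5         # start at a neutral mid intensity

    def update(self, texels):
        target = average_intensity(texels)
        # Exponential moving average so the bloom doesn't pop frame to frame.
        self.smoothed += (target - self.smoothed) * self.smoothing
        # Darker scenes get stronger bloom; clamp to avoid dividing by ~0.
        return 1.0 / max(self.smoothed, 0.1)
```

Because the result only modulates the *next* frame, there's a one-frame lag, which the temporal smoothing hides anyway.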

I think you could also do this using only a vertex shader, if you transform the image values (fed in as a texture or attribute) into bucket locations. The problem is that you need to be able to accumulate into the buckets, which is fine if you can blend into a surface with >= 16 bits of precision, but a lot of hardware can't do that.
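A CPU model of that scatter idea, to make the mechanics concrete: each value becomes a point whose output position is its bucket, and additive blending (modeled here as `+=` into a float buffer) does the counting. On real hardware of that era the `+=` is the hard part, since it needs blending into a float render target:

```python
# CPU model of the vertex-scatter histogram. One point primitive per
# texel: the "vertex shader" maps value -> bucket position, and additive
# blending accumulates hits into a 256x1 float render target.

def scatter_histogram(values, num_buckets=256):
    """values: luminance floats in [0, 1], one per texel/vertex."""
    buckets = [0.0] * num_buckets            # the 256x1 float target
    for v in values:                         # one point per value
        pos = int(v * num_buckets)           # value -> bucket location
        buckets[min(pos, num_buckets - 1)] += 1.0  # additive blend
    return buckets
```

With 8-bit blending the counts saturate after 255 hits per bucket, which is why the >= 16-bit blend requirement matters.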

I've been googling and reading up on the available techniques... it's not looking good. For my platform (X1600), and considering the strain I'm already putting on it, I don't think I have many options outside of a CPU histogram of a severely downsampled image.

Now, a simplification of my situation would be a simple average-luminance reading. If I could come up with the average luminance of a frame, I could use that to adjust the bloom levels dynamically. And I could probably compute it by downsampling the image (via several averaging passes) down to a single pixel, and using the luminance of that pixel...
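The repeated-averaging idea is just what a chain of half-resolution render targets computes; here's a small Python sketch of it (assuming a square, power-of-two image for simplicity):

```python
# Average luminance via repeated 2x2 box-filter passes, i.e. what a
# chain of half-resolution render targets does on the GPU.

def downsample_pass(img):
    """img: square 2D list of luminance floats, side a power of two."""
    n = len(img) // 2
    return [[(img[2*y][2*x] + img[2*y][2*x + 1] +
              img[2*y + 1][2*x] + img[2*y + 1][2*x + 1]) / 4.0
             for x in range(n)]
            for y in range(n)]

def average_luminance(img):
    while len(img) > 1:
        img = downsample_pass(img)
    return img[0][0]  # the final 1x1 "texture" is the frame average
```

Each pass exactly preserves the mean, so the final pixel is the true average luminance (up to the precision of the intermediate surfaces).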

On cursory examination, that's almost exactly what I'm doing right now! But the trouble is that the strength of the effect is critical: I've got situations where a strong bloom is required, and others where that same strong bloom overpowers the image.
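One way to keep the strong setting from overpowering bright scenes is to remap the average luminance through a clamped range, so the strength saturates at both ends instead of growing without bound. This is purely an assumption about how the mapping could work; all the constants here are made up:

```python
# Hypothetical remap: clamp average luminance into a working range
# [lo, hi], then interpolate bloom strength between a bright-scene
# minimum and a dark-scene maximum.

def bloom_from_luminance(avg_lum, lo=0.1, hi=0.7,
                         min_strength=0.3, max_strength=1.5):
    # t = 0 for a bright scene (avg_lum >= hi), t = 1 for a dark one.
    clamped = min(max(avg_lum, lo), hi)
    t = (hi - clamped) / (hi - lo)
    return min_strength + (max_strength - min_strength) * t
```

The clamp is the important part: a cave at luminance 0.02 and one at 0.08 both get the full dark-scene strength, rather than one of them blowing out.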