Just wanted to say I found inconsistent results when using autolevels with mvtools. Using our test video (the PetitDragon one) it seems to improve as many frames as it worsens.
It's a great plugin for other purposes though, great job Jim

Although I have no doubt that autolevels isn't guaranteed to only ever improve a clip (now wouldn't that be a great plugin), it would help to know what were the specific problems.

One common one is when a part of a scene should remain dark, autolevels will attempt to brighten it, which both affects the mood of the scene and results in a crappy, noisy picture.

Another common problem is too much clipping. The defaults for autolevels are set to match what they were in version 0.3, when I inherited it, which ignores the bottom 0.4% and top 0.4% of all pixels, pushing all of them into the clipping range. This is typically too much, and can often be fixed by setting it to something smaller like 0.1% or 0.05%.

For both of the above situations I have some ideas for heuristics which would improve the situation, but it doesn't seem possible to avoid all possible pitfalls.

Are you seeing some other problems? Are you using it on a yuv or rgb clip?

Quote:

Originally Posted by frustum

Another common problem is too much clipping. The defaults for autolevels are set to match what they were in version 0.3, when I inherited it, which ignores the bottom 0.4% and top 0.4% of all pixels, pushing all of them into the clipping range. This is typically too much, and can often be fixed by setting it to something smaller like 0.1% or 0.05%.

For both of the above situations I have some ideas for heuristics which would improve the situation, but it doesn't seem possible to avoid all possible pitfalls.

You've mentioned this several times in this thread, and to deal with it you added the "ignore" parameters. This greatly improved the usefulness of this filter. However, you've asked several times prior to this post for ideas for another heuristic to help cope with this fundamental problem, so here are a couple of ideas. Everything that follows, except for the last paragraph, focuses on calculations made within a frame (as opposed to the averaging that you do across "n" frames).

It seems to me that you need to introduce the additional concept of pixel grouping. Starting with a pathological case, under almost all circumstances I can think of, pure white single dots should be ignored when doing the levels calculation. To take the simple case, if a dot is 255 and all immediately adjacent pixels are a value "x" less than this, then the dot should be thrown out of all calculations.

This extreme case seems fairly straightforward (as are most extreme cases). As the number of adjacent pixels that are as bright or brighter (or as dark or darker) increases, more "thought" should be given to including this group of pixels in the calculations.

With this concept, the larger the grouping of spots that exceeds the upper or lower limits (ignore_high, ignore_low), the more likely your algorithm should include them in the calculations, even if the threshold for total number of pixels beyond the boundary hasn't been exceeded.
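The grouping idea above can be sketched roughly as follows. This is only an illustration in Python, not anything autolevels does: the function name, the threshold of 235 and the min_neighbors rule are all made up for the example. A bright pixel counts toward the statistics only if enough of its 8 neighbours are bright too, so a lone speck is thrown out while a solid patch is kept.

```python
def clustered_bright_pixels(frame, threshold=235, min_neighbors=2):
    """Return (row, col) of bright pixels that are part of a group.

    frame is a list of rows of 8-bit luma values. Hypothetical rule: a pixel
    at or above `threshold` is kept only if at least `min_neighbors` of its
    8 neighbours are also at or above `threshold`.
    """
    height, width = len(frame), len(frame[0])
    kept = []
    for y in range(height):
        for x in range(width):
            if frame[y][x] < threshold:
                continue
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and frame[ny][nx] >= threshold:
                        neighbors += 1
            if neighbors >= min_neighbors:
                kept.append((y, x))
    return kept


frame = [
    [40, 40, 40, 40, 40, 40],
    [40, 255, 40, 40, 40, 40],   # isolated speck at (1, 1)
    [40, 40, 40, 40, 40, 40],
    [40, 40, 40, 250, 250, 40],  # 2x2 bright patch
    [40, 40, 40, 250, 250, 40],
]
grouped = clustered_bright_pixels(frame)
print(grouped)  # only the four pixels of the 2x2 patch
```

A real implementation would presumably weight groups by size rather than use a hard yes/no cut, as the post suggests.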

One other heuristic would be to add the concept of an outlier. You obviously know the statistical definition. The visual problem I'm trying to define is where you have something like a specular reflection from a window pane, something like you'd get if you were videotaping out the front of your car, and the sun hits the rear window or chrome bumper of the car in front of you. This would give you a big area of pure white, and the auto-exposure in the camera would reduce the overall exposure, but you would still have the huge number of pure white pixels. The histogram of the result would have this huge number of pixels at or around 255, but then a huge gap between those and the rest of the histogram. Situations like this might be detected by looking for large gaps in the upper and lower boundaries of the histogram, with virtually no pixels at all between this big spike of pixels at the very end of the histogram, and the point at which the normal tail of the histogram resumes. If this additional heuristic existed, such video would still be automatically corrected, even though a very large percentage of pixels were nearly pure white.

The following really bad ASCII art is an attempt to show this. My suggestion is that the "hump" on the right side should be ignored (the real-life example would probably have the grouping on the right much closer together so it looks more like a single spike). The length of the "dead zone" between the two humps would be the "tip off" that the upper pixels are bogus.

So here's the heuristic: the narrower the right-hand hump; the closer that hump is to 255; and the bigger the gap of almost no pixels between that hump and the tail of the main distribution, the more aggressively the algorithm should ignore all the upper pixels.
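As a rough sketch of the dead-zone test (an illustration only; the function name, min_gap and empty_fraction are invented for the example and it only looks at the bright end), one could walk down the histogram from 255, step over the top hump, and measure how many nearly empty bins follow before the main distribution starts:

```python
def detached_highlight_start(hist, min_gap=20, empty_fraction=0.0005):
    """Return the luma level where a detached highlight hump begins, or None.

    hist is a 256-bin luma histogram. A bin is treated as "empty" when it
    holds no more than `empty_fraction` of all pixels (hypothetical rule).
    """
    total = sum(hist)
    limit = total * empty_fraction
    level = 255
    while level >= 0 and hist[level] <= limit:   # empty bins above the top hump
        level -= 1
    while level >= 0 and hist[level] > limit:    # the top hump itself
        level -= 1
    spike_start = level + 1
    gap = 0
    while level >= 0 and hist[level] <= limit:   # dead zone below the hump
        gap += 1
        level -= 1
    if level < 0 or gap < min_gap:
        return None  # nothing below the hump, or the gap is too narrow
    return spike_start


# Main distribution at 40..160 plus a spike at 250..255 (made-up numbers)
glare = [0] * 256
for v in range(40, 161):
    glare[v] = 1000
for v in range(250, 256):
    glare[v] = 5000

normal = [0] * 256
for v in range(40, 161):
    normal[v] = 1000

print(detached_highlight_start(glare))   # 250
print(detached_highlight_start(normal))  # None
```

This yes/no version ignores how narrow the hump is and how close it sits to 255; the heuristic as described would fold those in as a graded score.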

There is also the issue of one-frame departures from the average, such as you get when someone uses a camera flash. Such frames should not be part of the moving average. However, I assume you already do this.

John, thanks for the input. I had some time this weekend to think a little about autolevels, and one of the ideas is along the lines of your first comment -- a clustered group of N luma=255 pixels should matter more than N single pixels with luma=255 which are scattered around. My thought was to pre-process each frame with something like a median filter with a small radius (maybe just 1) to get rid of those specks. Perhaps one of the more sophisticated speckle removal filters would be enough. Stats would be taken on this munged frame and applied to the original frame pixels.
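A minimal sketch of that pre-processing step, in plain Python and assuming a radius-1 (3x3) median that leaves the border pixels alone. The helper names are hypothetical; the point is only that the statistics come from the filtered copy while the original pixels are the ones that would be adjusted.

```python
from statistics import median


def median3x3(frame):
    """3x3 median of a luma frame (list of rows); edge pixels are left untouched."""
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [frame[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


def luma_range(frame):
    """Return (min, max) luma over the whole frame."""
    values = [v for row in frame for v in row]
    return min(values), max(values)


# Horizontal gradient with a single white speck in the middle
frame = [[50, 60, 70, 80, 90] for _ in range(5)]
frame[2][2] = 255

raw_range = luma_range(frame)                 # the speck sets the maximum
clean_range = luma_range(median3x3(frame))    # the speck is gone from the stats
print(raw_range, clean_range)
```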

Warning: rambling thoughts ahead.

Another fundamental problem is this: the statistics take the mean histogram high/low values in a window +/-N frames of the current frame (ignoring frames judged to be from a different scene). The problem is that if the histogram changes suddenly, say due to the scene panning and something glaring in the sun pops into frame, this bright area will be visible in N+1 frames or fewer but it is averaged with 2N+1 frames by the time that bright spot becomes the center of the averaging window. To be more specific, say maxluma=200 for a good long time, then this bright object suddenly appears and stays in frame, and shows up as a spike near luma=240. maxluma averaged over the 2N+1 frame window will be (N*200 + (N+1)*240) / (2N+1), or roughly 220 for larger values of N. This will cause all the pixels with luma > 220 to get saturated, wiping out all detail in the bright regions.
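To put numbers on that example: with N frames at maxluma=200 before the bright object and N+1 frames at 240 from its first appearance onward, the window mean is the weighted sum divided by 2N+1. A quick check in Python (the function name is just for illustration):

```python
def window_mean_maxluma(n, old=200, new=240):
    """Mean maxluma over a 2N+1 window centred on the first bright frame:
    N earlier frames at `old`, the centre and N later frames at `new`."""
    return (n * old + (n + 1) * new) / (2 * n + 1)


for n in (2, 5, 50):
    print(n, round(window_mean_maxluma(n), 2))
```

The result sits a little above 220 for any N and approaches 220 as N grows, so the frame whose real maximum is 240 gets corrected as if its maximum were about 220.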

I think I should take this into account and do a smooth limiting function when calculating the average maxluma. Doing something simple like avg_maxluma = min( sum(maxluma(frame=current-N to current+N)), max(maxluma(frame=current-N to current+N)) ) would fail -- the first time the bright frame appeared in the averaging window would cause a sudden drop in brightness. Instead, it should have an influence based on its distance from the center of the window.
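One possible shape for such a distance-based limit, purely as a sketch and not what autolevels implements: start from the plain window mean, then let each frame pull the result toward its own maxluma with a weight that is 1 at the center of the window and tapers linearly toward the edges. The function name and the linear taper are assumptions made for the example.

```python
def limited_maxluma(window):
    """window holds maxluma for frames current-N .. current+N (odd length).

    Returns a value that is never below the center frame's own maxluma and
    that reacts only gradually to bright frames near the window edges.
    """
    n = len(window) // 2
    mean = sum(window) / len(window)
    best = mean
    for i, value in enumerate(window):
        weight = 1.0 - abs(i - n) / (n + 1)   # 1 at the center, 1/(N+1) at the edges
        best = max(best, mean + weight * (value - mean))
    return best


entering = [200] * 10 + [240]        # bright frame has just entered the window
centered = [200] * 5 + [240] * 6     # bright frame is now the current frame

print(round(limited_maxluma(entering), 1))  # a small rise, not a jump to 240
print(round(limited_maxluma(centered), 1))  # the current frame's own maximum
```

With this shape the bright frame nudges the limit up a little when it first enters the window, and the limit reaches the full 240 by the time that frame is the one being corrected.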

I've downloaded and perused a number of PDFs of people thinking about this same problem. One takes a very sensible approach of figuring out minluma and maxluma for each scene, then the correction for each frame is based on a combination of the minluma & maxluma of the current frame and the current scene. This algorithm isn't directly suitable for an avisynth filter, as a literal implementation would require searching forward and backward potentially for an unbounded number of frames until a scene boundary is detected. In real life, scene lengths are typically less than a minute, but that can still be more than 1000 frames. It might make sense to approximate the algorithm by having a wide "scene window" and a smaller averaging window like the one that currently exists.

Another failure mode of autolevels() which I had never thought about is the problem of fades and blended transitions. I was unaware of this because all of my work has been on 8mm home movies, and every cut is a jump cut. I think it would be relatively easy to look for a smooth transition to/from near black and to preserve it rather than attempting to boost those dark frames. Having that larger "scene" window would make it easy to see such transitions coming.

I had also long thought of some ways to improve autolevels' scene boundary detection logic using just the already-gathered histogram data, but before charging off to try it, I did a google search. It turns out that detecting scene boundaries is a ripe research area. Google on "shot boundary detection" and you'll find dozens of papers. There is even an annual shoot-out where researchers pit their latest algorithms against a battery of videos to see which works best. Many of these algorithms are not suitable for avisynth for the same reason stated before: they assume that the entire video will be processed in order, perhaps with two passes. avisynth filters can do this, but even with caching, the first frame out of the filter may take many seconds/minutes to produce.
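For a sense of what a histogram-only detector can look like, here is a bare-bones sketch of one generic approach: compare each frame's normalised luma histogram with the previous one and flag a boundary when they differ by more than a threshold. It is not the logic inside autolevels, and the threshold of 0.5 and the function names are arbitrary choices for the example.

```python
def histogram_distance(hist_a, hist_b):
    """Normalised L1 distance between two histograms: 0 = identical, 1 = disjoint."""
    total_a, total_b = sum(hist_a), sum(hist_b)
    return 0.5 * sum(abs(a / total_a - b / total_b)
                     for a, b in zip(hist_a, hist_b))


def scene_boundaries(histograms, threshold=0.5):
    """Indices i where frame i is judged to start a new scene."""
    return [i for i in range(1, len(histograms))
            if histogram_distance(histograms[i - 1], histograms[i]) > threshold]


# Tiny 4-bin histograms standing in for real 256-bin ones
dark_a = [90, 10, 0, 0]
dark_b = [85, 15, 0, 0]
bright = [0, 5, 15, 80]

cuts = scene_boundaries([dark_a, dark_b, bright, bright])
print(cuts)  # [2]
```

A detector this simple only needs the previous frame, which suits a frame-serving filter, but it will miss slow fades and can misfire on flashes.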

One final observation -- none of the photo and video equalization algorithms operate in YUV space -- they either use RGB or better yet, HSV. It wouldn't be hard to change autolevels to convert from YUV to HSV, make adjustments, then convert back, but it would be a performance hit and I know for a great number of avisynth users, speed is more important than quality (well, quality matters, but not if it drops processing to sub-realtime).

Quote:

Originally Posted by frustum

The problem is that if the histogram changes suddenly, say due to the scene panning and something glaring in the sun pops into frame, this bright area will be visible in N+1 frames or fewer but it is averaged with 2N+1 frames by the time that bright spot becomes the center of the averaging window.

To me, the heart of a better autolevels algorithm is dealing with these anomalies. Obviously, you first need to be able to correct a single frame and, to do this, you need to decide what constitutes correct levels. The original autolevels code did this and I think you have taken that part of the code forward. As I understand it, you basically apply a gamma function so the midtones are corrected more than the darkest and lightest part of the picture. To me, this is the toughest part of the whole thing and would be the place I would rely on other people's research. There is more art than science in figuring this out and doing it well.

However, I don't think I would look to averaging to provide a better fix to an individual frame's exposure. Averaging should instead, I think, be viewed ONLY as a way to avoid "hunting" and "pumping" problems, and not as a way to get a better exposure. After all, the sun may not just "pop" into the frame, but might be part of the scene for many minutes. If you are relying on averaging to make a proper correction, then in this example you'll never get anything different with which to average.

Quote:

Originally Posted by frustum

One takes a very sensible approach of figuring out minluma and maxluma for each scene, then the correction for each frame is based on a combination of the minluma & maxluma of the current frame and the current scene.

It is definitely true that some footage does exhibit the "shifted histogram" where the darkest pixels are way too bright or the brightest pixels too dark. Even when this occurs, you still need to truncate some outliers because there always seems to be a 0 or 255 out there somewhere, even when the exposure is clearly wrong. So simply using maxluma and minluma I think will lead to bad results, but using a slightly modified version, where you first throw out some outliers, might work. Concentrating on these values rather than using some sort of average would certainly stop the algorithm from trying to correct a sunset where the average luma is low, and there is almost nothing in the center of the histogram.

Once you correct the frame, I think I would use some sort of averageluma function to determine how to average the exposure. I actually think that you want to keep the number of frames averaged quite small, however. I can see no good coming from averaging more than about one or two seconds of video. Imagine standing in a forest at the edge of a lake. You start pointing the camera to the left, into the forest. The video is very dark, but the camera auto exposure turns up the gain so it isn't too dark. It still needs to be made lighter. The camera then pans across the lake, where the sun is hitting the water. The camera's autoexposure tracks the change, although probably with a lag. When the camera pan leaves the lake and points to the forest on your right, everything gets dark again. You don't want to dampen the corrections too much (by averaging LOTS of frames), or you'll end up not correcting anything.

I think you need to actually write down and clearly answer the question: "why am I averaging frames?" Make sure the objectives are clear, and that averaging is the best answer to the problem you are trying to solve.

So, I think if I were doing this, I'd spend most of my time trying to come up with a fantastic single frame exposure (levels) correction, and then at the end apply some "safety valves" that keep the correction from failing when subjected to specular reflections, scene changes, bright camera strobe flashes, and other short duration transient phenomena.

Quote:

Originally Posted by frustum

Another failure mode of autolevels() which I had never thought about is the problem of fades and blended transitions. I was unaware of this because all of my work has been on 8mm home movies, and every cut is a jump cut.

I don't see why you have to consider this. I do film transfers all the time, but also edit lots of video. Fades and blended transitions don't happen in unedited video (I guess you can do them "in camera," but no one ever uses those features). Once footage has been edited, it then contains these transitions, but that is true of edited 8mm, Super 8, 16mm, 35mm, & other film formats as well. I would not bother to try to deal with this.

Quote:

Originally Posted by frustum

I had also long thought of some ways to improve autolevels' scene boundary detection logic using just the already-gathered histogram data, but before charging off to try it, ...

I've written several scene detection scripts and, while they all fail sometimes, they generally work well enough to be useful. A simple

(This code is actually a fragment taken from a WriteIf statement so its syntax may be slightly screwed up.)

You can also use the scene change logic in MVTools2. This is actually faster, if you have the MT version of AVISynth, than is the script code above. I can't remember, but I think this code is lifted more or less intact from the MVTools2 doc:

Finally, FWIW, here are the failures I have seen most often in both video and still photo "autolevel" algorithms.

1. Scenes which are naturally supposed to be dark, like sunsets, get gained up too much.

2. Bright objects, like someone walking directly in front of a camera-mounted video light, cause the main video to go pitch black.

3. The autolevels algorithm attempts to always stretch the histogram to the limits, so at least one pixel is 255 and one pixel is 0. This often results in unnatural, high contrast.

4. The midtones are not stretched enough.

The last one is the toughest. Under- or over-exposed photos and video almost always compress the midtones (a necessary by-product of not using the whole exposure range). Simply moving those pixels up or down doesn't really solve the problem. Instead, the normal "hump" that represents the center range of exposure not only needs to be moved up or down, but also stretched out. I do this with custom gamma curves when I do my own corrections, but it is tricky stuff because you end up solarizing the image if you get the slope of the curves too steep.
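To make the "move and stretch" idea concrete, here is a small Python sketch of a levels-with-gamma lookup table: the input range is normalised, raised to 1/gamma, then mapped onto the output range. The function names are invented for the example, and the max_step helper is only a crude way to see how steep the curve gets; neither is taken from autolevels or from the custom curves described above.

```python
def levels_lut(in_low, gamma, in_high, out_low=0, out_high=255):
    """Build a 256-entry lookup table for a levels adjustment with gamma."""
    lut = []
    for v in range(256):
        t = (v - in_low) / (in_high - in_low)
        t = min(max(t, 0.0), 1.0)                 # clip outside the input range
        lut.append(round(out_low + (t ** (1.0 / gamma)) * (out_high - out_low)))
    return lut


def max_step(lut):
    """Largest jump between adjacent output levels (a rough steepness check)."""
    return max(b - a for a, b in zip(lut, lut[1:]))


# Hypothetical underexposed clip: useful data between 20 and 180
lut = levels_lut(20, 1.5, 180)
print(lut[20], lut[100], lut[180])   # black point, lifted midtone, white point
print(max_step(lut))                 # how far apart neighbouring levels can land
```

A gamma above 1 lifts the midtones more than the ends, and the price is a steep section near the black point where adjacent input levels end up several output levels apart.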

Last edited by johnmeyer; 25th April 2011 at 16:10.
Reason: Changed a few word errors, just after posting

Feature suggestion: the possibility to make the estimates from a separate clip and apply them to the original clip. Maybe even from a separate image. The original use I thought of for this feature was made obsolete by the "border" parameter. But still, it would be interesting to have.

EDIT: The output of version 0.6 is too saturated for my taste (see attachment). Can I control the degree of saturation correction that this new version applies? I haven't found any way, and as this saturation correction is dynamic, I can't just fix it afterwards...

EDIT2: Putting a tweak(sat=0.65) before the autolevels() seems to do the trick.

@johnmeyer A while ago I tried to contact frustum through email, but he was very busy. I really wanted him to add a feature to limit the amount of stretching, which I found to be the biggest problem in most videos. I still hope he will find some time again for this very nice plugin. The settings I use now are Autolevels(border=12,filterRadius=4,sceneChgThresh=255,ignore_low=0.0002,ignore_high=0.0001). I don't know why, but it seems filterRadius eats lots of memory? More frames = more buffer?

*** I'm editing this post because the way I thought Autolevels() works was completely wrong. What follows is the corrected reply ***

Thanks Fred!

But Autolevels() just sets the max value according to Output_high and then expands the image accordingly. This does save some details, but also reduces overall brightness throughout the video ... So is there a way to affect the bright parts only ??

Thanks a lot Didée!
Adding AddBorders did help in reducing the clipping, but unfortunately it also affected dark scenes, leaving them almost unchanged in comparison to not using AutoLevels at all. Problem solved when setting AutoLevels' Ignore_High parameter to 0.0001.
...Then I realized that Ignore_High alone was enough for the result I was aiming for.

I played around with autolevels and studied the documentation for it. However, I cannot figure out what exactly the "ignore" parameter does. I understand that it should influence the statistics and which pixel values will be taken into account for the calculation of the histogram stretch. But what does this value mean?

Let's say I would like to tell the filter to use only luma values that appear in more than 10 pixels of the complete image. In my example the lower and upper tails of the luma histogram hold some of these bars. How do I tell the filter where to start clipping the histogram? What happens if I enter 0.01 for this parameter?

I've been away from avisynth for a long time, and as such, haven't even read many of the recent comments. I have scanned them, but I need to reread more of the whole stream and think about it. In a side email to videoFred about autolevels, I wrote the following, but by the end decided to ask everyone their thoughts.

Stoffal, "ignore" ignores the specified fraction of highest and lowest luma pixels of the scene before taking statistics and deciding how much to adjust luma. Just like in sporting events where they toss out the highest and lowest results from the panel of judges before averaging the rest, "ignore" is a way to prevent a few outliers from throwing off the processing for the entire frame.
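To illustrate the idea (this is a sketch of the general technique, not the plugin's source; the function name and the exact rounding at the boundaries are assumptions), the low and high limits can be found by walking in from each end of the histogram until more than the ignored fraction of pixels has been passed:

```python
def histogram_limits(hist, ignore_low, ignore_high):
    """Return (low, high) luma levels after discarding the given fractions of
    the darkest and brightest pixels from a 256-bin histogram."""
    total = sum(hist)
    skip_low = total * ignore_low
    skip_high = total * ignore_high

    low, seen = 0, hist[0]
    while seen <= skip_low and low < 255:
        low += 1
        seen += hist[low]

    high, seen = 255, hist[255]
    while seen <= skip_high and high > 0:
        high -= 1
        seen += hist[high]
    return low, high


# 100,000 pixels: the picture proper at 50..149, plus 50 stray pixels at each extreme
hist = [0] * 256
hist[0] = 50
hist[255] = 50
for v in range(50, 150):
    hist[v] = 999

print(histogram_limits(hist, 0.0, 0.0))        # strays define the range
print(histogram_limits(hist, 0.0004, 0.0004))  # 40 pixels ignored: still not enough
print(histogram_limits(hist, 0.001, 0.001))    # 100 pixels ignored: strays dropped
```

Read this way, the value is a fraction of the frame's pixel count rather than a per-level pixel count, so 0.01 would throw away the darkest 1% and brightest 1% of pixels before the stretch is calculated.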

I'll probably spend some time trying to improve autolevels in the coming weeks. In the original autolevels, the Y channel would get modified, but the UV channels would be untouched. However, mathematically, that isn't right. UV really isn't an independent color space, just a linear combination of RGB. To make it concrete, if you have these two rgb colors:

$804040
$C06060

both should have the same color, but the second one is 50% brighter (ignoring any gamma which might happen later) as r,g,b are each scaled by 1.5. But the YUV equivalents are

You can see that the cr and cb components are affected by scaling RGB, even though they are different brightnesses of the same hue.
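The effect is easy to reproduce. The sketch below converts the two colors with full-range BT.601 coefficients, which is an assumption on my part: a limited-range or BT.709 conversion would give different numbers, though the same pattern.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr (assumed; not necessarily what the filter uses)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 + (b - y) / 1.772
    cr = 128 + (r - y) / 1.402
    return y, cb, cr


dim = rgb_to_ycbcr(0x80, 0x40, 0x40)      # $804040
bright = rgb_to_ycbcr(0xC0, 0x60, 0x60)   # $C06060

for name, (y, cb, cr) in (("$804040", dim), ("$C06060", bright)):
    print(f"{name}: Y={y:.1f} Cb={cb:.1f} Cr={cr:.1f}")
```

Because the conversion is linear, the distance of Cb and Cr from the neutral value 128 grows by the same factor of 1.5 as Y does.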

So, even though it seems mathematically correct to scale the UV as the brightness changes, I do see that sometimes it makes slight color shifts and magnifies them especially when autolevels is stretching the histogram a lot, and people are objecting to that. But if I don't scale at all, as things are made brighter, they will lose saturation, and things which are made darker will gain saturation.

Do you have any idea of how to deal with this?

For instance, I could add a chroma_strength parameter; set it to 0 and the UV values aren't changed at all, and that would be the default so that we have the same behavior as the original autolevels. If you set chroma_strength to 1.0, then you get the current 0.6 autolevels behavior where things are scaled according to what the math suggests is correct. If you set it in between, then you get the weighted average between these two cases.
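A sketch of how such a blend might work, assuming the "mathematically correct" chroma gain equals the luma gain and that chroma is scaled about the neutral value 128. Both are assumptions for illustration, and scale_chroma is a hypothetical helper, not an existing function.

```python
def scale_chroma(u, v, luma_gain, chroma_strength):
    """Blend between untouched chroma (strength 0) and chroma scaled by the
    full luma gain (strength 1). Values are 8-bit with 128 as neutral."""
    gain = 1.0 + chroma_strength * (luma_gain - 1.0)

    def apply(c):
        return min(255, max(0, round(128 + (c - 128) * gain)))

    return apply(u), apply(v)


print(scale_chroma(120, 160, 1.5, 0.0))  # (120, 160): original behavior
print(scale_chroma(120, 160, 1.5, 0.5))  # (118, 168): halfway
print(scale_chroma(120, 160, 1.5, 1.0))  # (116, 176): full scaling
```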

And/Or I could add chroma_scale_limit, and perhaps a luma_scale_limit, as parameters. It would prevent autolevels from scaling UV and Y, respectively, more than the specified factor. For a scene with a very narrow range of Y values, currently it gets stretched all the way out to 0..255 (or 16..235). Setting luma_scale_limit to 2.0 would prevent autolevels from scaling Y by more than the range 0.5 to 2.0. For instance, I just got some film back and the leader, which is nominally gray but has almost no contrast, turns into a rainbow because the slight UV variations get scaled dramatically at the same time the luma is getting scaled from a range of (60..78) to (16..235) or whatever.

And since this filter provides both autolevels() and an autogamma() interface, I guess I would add a gamma_scale_limit along the same lines.

And if we are now talking about having a transfer function mapping the mathematically correct scaling to something more modest, should it be a simple linear ramp with clamping, or should it have a soft transition to the limit-clamped range? This filter already has too many parameters.
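The two options can be compared side by side. The sketch below is only one way to do it: hard_limit is a plain clamp to [1/limit, limit], and soft_limit uses a tanh curve in log space so that 0.5 and 2.0 are treated symmetrically. The function names and the choice of tanh are assumptions for the example, and limit is taken to be greater than 1.

```python
import math


def hard_limit(scale, limit):
    """Linear ramp with clamping: pass the scale through, clipped to [1/limit, limit]."""
    return min(max(scale, 1.0 / limit), limit)


def soft_limit(scale, limit):
    """Soft transition: follows the requested scale near 1.0 and approaches
    the same bounds smoothly instead of hitting them abruptly."""
    return math.exp(math.log(limit) * math.tanh(math.log(scale) / math.log(limit)))


# The gray-leader example: stretching (60..78) to (16..235)
requested = (235 - 16) / (78 - 60)

for scale in (1.0, 1.3, 2.0, requested):
    print(round(scale, 2),
          round(hard_limit(scale, 2.0), 3),
          round(soft_limit(scale, 2.0), 3))
```

The soft version starts backing off well before the limit (a requested 2.0 comes out noticeably lower), which avoids a visible kink but also means modest corrections are slightly under-applied. That trade-off is really what the question above is asking about.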

An idea I mentioned before that circumvents some of this is to not do processing in YUV space at all, but to map RGB or YUV to a better color space where chroma really is independent of luma (or whatever it is called in that color space), stats are taken, luma is adjusted, then map back to RGB or YUV.

I also noticed that 2.6 added a dither parameter to levels(), so I should probably add the same feature to autolevels().