OK, so you want to prevent candidate generation from occurring for filter
terms which might result in large candidate sets. First of all, assuming
that that's even a valid thing to do (noting your issues listed above), I
would just define a new limit analogous to sizelimit.unchecked, and skip
the probability guessing games. E.g. sizelimit.intermediate, which would
be checked at intermediate stages of filter evaluation. That would
render sizelimit.unchecked moot.

The implementation would apply this limit to each individual filter term
lookup, and fail with ADMINLIMIT_EXCEEDED when any term exceeds the limit.

In practice I think this will cause a lot of harm though; it will cause
ANDed filters to fail that would otherwise come in under the unchecked
limit.
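
To make that concern concrete, here is a minimal Python sketch. It assumes
a hypothetical per-term limit (intermediate_limit, standing in for the
proposed sizelimit.intermediate), models index lookups as plain sets of
entry IDs, and uses an illustrative AdminLimitExceeded exception in place
of the real result code. None of this is actual slapd code.

```python
class AdminLimitExceeded(Exception):
    """Stand-in for the LDAP adminLimitExceeded result code."""


def and_candidates(term_sets, intermediate_limit=None):
    """Intersect per-term candidate sets for an ANDed filter.

    If intermediate_limit is set, each individual term lookup is checked
    against it before being intersected (hypothetical behaviour).
    """
    result = None
    for ids in term_sets:
        if intermediate_limit is not None and len(ids) > intermediate_limit:
            raise AdminLimitExceeded(f"term returned {len(ids)} candidates")
        result = set(ids) if result is None else result & ids
    return result if result is not None else set()


# Toy index: one broad term and one very selective term.
object_class_person = set(range(10_000))   # (objectClass=person)
uid_jdoe = {4242}                          # (uid=jdoe)

# Without a per-term limit the AND yields a single candidate,
# well under any reasonable unchecked limit.
final = and_candidates([object_class_person, uid_jdoe])

# With a per-term limit of 1000 the same filter is rejected,
# because the first term alone exceeds the limit.
try:
    and_candidates([object_class_person, uid_jdoe], intermediate_limit=1000)
    rejected = False
except AdminLimitExceeded:
    rejected = True
```

The filter (&(objectClass=person)(uid=jdoe)) is cheap overall, yet in this
sketch it fails as soon as the broad term is looked up.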

No, not to each filter term: that's the naive (wrong :) way to do it.
I'd apply it to the result of the speculation. Each filter term would
contribute to computing the final score, which determines whether the
limit (or, more generally, the action) applies.
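
A rough sketch of that idea, in Python: each term contributes an estimate,
the estimates are combined up the filter tree, and only the final figure is
compared against the limit. The combination rules (minimum for AND, sum for
OR), the tuple-based filter representation, and all names here are
assumptions made for illustration, not a description of any existing
analyzer.

```python
def estimate(node, term_estimates):
    """Return an upper-bound candidate estimate for a filter tree.

    node is either a term name (str) or a tuple ("and" | "or", [children]).
    term_estimates maps each term name to its estimated candidate count.
    """
    if isinstance(node, str):
        return term_estimates[node]
    op, children = node
    scores = [estimate(child, term_estimates) for child in children]
    if op == "and":
        # An intersection can be no larger than its smallest operand.
        return min(scores)
    if op == "or":
        # A union can be no larger than the sum of its operands.
        return sum(scores)
    raise ValueError(f"unsupported operator: {op}")


def limit_applies(filter_tree, term_estimates, limit):
    """Apply the limit to the final score only, never to a single term."""
    return estimate(filter_tree, term_estimates) > limit


# Hypothetical per-term estimates.
estimates = {"objectClass=person": 10_000, "uid=jdoe": 1, "cn=*": 9_000}

and_filter = ("and", ["objectClass=person", "uid=jdoe"])
or_filter = ("or", ["objectClass=person", "cn=*"])
```

With a limit of 1000, limit_applies() lets and_filter through even though
one of its terms is large, while or_filter trips the limit.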

I'm not saying I want to implement anything like that. I'm saying
that's how I'd design a filter analyzer. Actually, that's how we
designed it for a slightly different, and much more deterministic,
problem: detecting when a composite filter complies with a certain,
very well-defined paradigm; sort of:

assuming there's a deterministic way to determine when
a simple filter complies with a certain paradigm,