The invalid handle isn't good for accessing anything, but it can still identify a unit.

Not really. If you say have a Loop* pointer, which is freed and then another Loop allocated happens to be just at that pointer
then you dont have a way to identify whether it is the old Loop or the new one.
Presumably that does not happen now...

Mon, Dec 10

Invalid handle might be totally invalid, you can not use it for anything.
I do remember having accesses to freed memory when accessing either Loop or SCC, dont remember which one.
Even looking at the address might be not that sane if it was freed and perhaps something else allocated at that place.

In case the IRUnit has been invalidated, can we still pass it to the callback? When I think of instrumentations that do more complicated things than printing, it's highly likely that they would like to be notified when an IRUnit handle becomes invalid. Which requires passing the handle. If we are passing the handle, though, we need to emphasize that the unit isn't accessible through the handle anymore. Maybe by passing void* instead, or some tag type Invalidated<T> { void* Handle; }. The latter can go through the any and allow deducing the actual IRUnit type.

Mon, Dec 3

If asan doesn't modify the module for this ('this' meaning the Mapping and the GlobalsMD here), than this is an excellent candidate for an analysis.

When you say analysis, do you mean like the TargetLibraryInfoWrapperPass that AddressSanitizer is dependent on? Not sure what the difference between passes and analyses are, but if the purpose of TargetLibraryInfoWrapperPass is to just pass a TargetLibraryInfo to AddressSantizer, then it seems a new WrapperPass could similarly be made for this initialized data as a custom class made specifically for ASan.

I am not an expert or have a good context, please excuse my trespassing, just as a guy reading the code and trying to understand, this is my understanding:

GlobalsMD.init(M); is the only expensive part of the AddressSanitizer initialization.

All data in GlobalsMD is obtained from a module, in GlobalsMetadata::init(Module& M).

(Maybe this is not obvious) Function objects have a reference to their module via Function::getParent() method.

Therefore, every expensive data that the function pass for AddressSanitizer needs, can be stored in Module objects:

Up to this point it kinda makes sense.
However, this suggestion of storing supplementary passes information into the IR itself does not sound like a good design decision.
A typical solution for LLVM to store supporting information is to introduce an Analysis, so it kinda floats around the IR, but
is not being stored directly into it.

i.e. the problem is that 'depth 2' messages are indented, and (for some reason) the tests want them all sorted by message.

The tests are trying to count loops at different depths, using -COUNT-<NUM> directives, e.g. 11 loops at depth 1, 40 loops at depth 2.
Alternative to sorting would be to implement support for -COUNT-<NUM>-DAG.

Eventually we should implement error checking for new-pm pass names.
That will require adding more new-pm related code to the parser, and introduce some
kind of an interface with PassBuilder. Definitely *not* planned for this patch.

The test introduced here seems to be invalid.
It assumes that 16*128 attempts to choose a random hex number will cover all 16 available numbers.
Obviously, the chance for this test to fail is rather low but it is not 0.

To me it would be a cleaner solution if you could split legacy FunctionPass interface implementation into a separate wrapper pass (e.g. ScalarizerLegacyPass),
similar to how new-pm wrapper ScalarizerPass implements all the new-pm interfaces.
That would allow to separate actual transformation (Scalarizer) from pass-manager details (ScalarizerPass/ScalarizerLegacyPass).

Discovered a tricky testcase with 16-way switch and nested exiting branches which manages to skip the multiplier introduced here and still lead to exponential explosion.
Definitely need to check if exiting branch dominates the latch before skipping its multiplier calculation.
Also the testcase shows that we really need to go further calculating costs per candidate and using that in calculation of exponential-explosion threshold, just as Chandler asked to.

Making this a module pass will cost us all the nice chaining and cache locality we get from a function pass.
That will make the already slow instrumentation even more expensive. Do you see a way to keep this as a function pass?

I assume the main problem with this being function pass in NewPM is the need for initialization, which is too slow when being done per-function.