I checked the code and this happens indirectly inside a call to compareDirectoryTrees, so it would indeed be hard to request a binary diff explicitly. Instead, could we detect whether a file is binary and, if it is, fall back to a binary diff? For example, using code like this. We could put this in a function named lit.util.is_binary(file), and compareDirectoryTrees could call it and then branch to the appropriate diff function.
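To make the suggestion concrete, here is a rough sketch of the detect-and-branch idea. It is not lit's actual code: the NUL-byte heuristic, the 1024-byte sample size, and the helper names diff_binary, diff_text, and compare_files are all assumptions for illustration; only the name is_binary comes from the proposal above.

```python
import difflib


def is_binary(path, sample_size=1024):
    """Heuristic (assumed): a file is binary if its first bytes contain a NUL."""
    with open(path, "rb") as f:
        return b"\0" in f.read(sample_size)


def diff_binary(path_a, path_b):
    """Hypothetical helper: byte-for-byte comparison, empty list if identical."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        if a.read() == b.read():
            return []
    return ["Binary files %s and %s differ" % (path_a, path_b)]


def diff_text(path_a, path_b):
    """Hypothetical helper: unified line diff, empty list if identical."""
    with open(path_a, "r", errors="replace") as a:
        lines_a = a.readlines()
    with open(path_b, "r", errors="replace") as b:
        lines_b = b.readlines()
    return list(difflib.unified_diff(lines_a, lines_b, path_a, path_b))


def compare_files(path_a, path_b):
    """Branch to the binary diff if either side looks binary."""
    if is_binary(path_a) or is_binary(path_b):
        return diff_binary(path_a, path_b)
    return diff_text(path_a, path_b)
```

Checking for a NUL byte in a small prefix is cheap and is the same kind of heuristic many diff tools use, but it can misclassify some files, such as UTF-16 text, so the real check may need to be stricter.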

I spent some time today figuring out how to a) reduce the time the test takes to run and b) make the uploaded binary as small as possible. I'm uploading a binary so that not every test has to regenerate the same file (generating it dominates the running time). I tried to make this binary as small and as compressible as possible, and got the compressed archive down to 147 kb, which I think is acceptable. On my machine the total running time of decompressing, running llvm-objcopy, and then checking the data is about 1.6 seconds, which is acceptably fast in my opinion. This problem was much more solvable than the one I hit when I implemented 64-bit symbol offsets for archives.

Thanks, that is very helpful. It's much more likely that dominator info is up to date than loop info, but it's not worth getting into here. The only thing I request is that you update the comments to say that this probably only makes sense to use if LoopInfo is already computed, and please add your description of how it works, along with the example, to the comments.

Can you update the description to clarify that this is fixing a bug in the indexing library? From the description it sounds like we have a serious bug in FUNCDNAME codegen, which is not the case. CodeGen does the right thing. The ASTContext API is just crappy, so the Index library used it incorrectly. The fix is to simplify the ASTContext API so that it owns the mangling context.