Today

Thanks, that is very helpful. It's much more likely that dominator info is up to date than loop info, but it's not worth getting into here. The only thing I request is that you update the comments to suggest this probably only makes sense to use if loopinfo is computed already, and please add your description of how it works, with the example, to the comments.

Yesterday

Hey, can you help me understand the algorithm?
I don't have a good feel for the invariants you expect going in.
It just says "it works on anything given loop info".
IE, what info you expect the loopinfo to provide vs. the topo ordering.

Sun, Feb 18

To be clear, i think the lifetime system has enough issues that trying to cover it with MemorySSA as it currently exists wouldn't end up well.
Things that want to care about and try to optimize around lifetimes can do that.
Intermixing object lifetime and aliasing results in the same system will, IMHO, give you a bad time.

Wed, Feb 14

Personally, i totally think this is worthwhile and worth spending some time on to prevent the slow proliferation of 10+ argument functions taking analysis results. Because that's the path we are heading down, and that's definitely hard to refactor, change, etc
But i'm also not the person who would do the work (though happy to review).

An example of that is getBestSimplifyQuery, which takes various analysis managers, etc., and extracts the analyses it cares about from them.
If we did similar, we could simplify what gets passed through and easily add updating of a new thing.
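A minimal sketch of that pattern, with hypothetical names (this is not LLVM's actual getBestSimplifyQuery or its real analysis types): bundle the analyses into one query object, so call sites take a single parameter and threading a new analysis through means touching one struct rather than every signature on the path.

```cpp
#include <cassert>

// Hypothetical stand-ins for analysis results; the real LLVM types differ.
struct DominatorTree {};
struct TargetLibraryInfo {};
struct AssumptionCache {};

// One bundle instead of N parameters. Any member may be null if the
// corresponding analysis is unavailable; consumers null-check.
struct Query {
  const DominatorTree *DT = nullptr;
  const TargetLibraryInfo *TLI = nullptr;
  const AssumptionCache *AC = nullptr;
};

// A consumer only uses what the bundle actually carries.
bool canUseDomTree(const Query &Q) {
  return Q.DT != nullptr;
}
```

Adding, say, a post-dominator tree later would mean one new member in `Query`, not another argument on dozens of functions.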

So this looks right to me.
I'm a little worried about how we are having to modify tons and tons of functions to push these through.
When we get to postdominator updates, we'll have to do the same thing.

Wed, Feb 7

LVI functions fine without DT, and it's going to be too expensive to ever use a lazy analysis that is constantly recomputing info like LVI, because it may require hundreds or thousands of dominator tree rebuilds. I would hand LVI a fake DT pointer class: when the dominator tree is up to date, it works, and when it's not up to date, it is null. That would work with how LVI currently checks things, I think. Regardless, I think the right answer here is to make it not use the DT when it's not up to date. That will give better answers than it used to get, and you get gradually better answers as you can preserve more or become less lazy.
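A sketch of the "fake DT pointer class" idea described above. The names (`LazyDomTreeHandle`, the empty `DominatorTree` stand-in) are illustrative, not LLVM's actual API; the point is only the shape: clients null-check as they do today, and the handle yields the tree only while it is known to be up to date.

```cpp
#include <cassert>

struct DominatorTree { /* stand-in for llvm::DominatorTree */ };

// Wraps a DominatorTree and hands it out only while it is up to date.
class LazyDomTreeHandle {
  DominatorTree *DT;
  bool UpToDate;

public:
  explicit LazyDomTreeHandle(DominatorTree *DT) : DT(DT), UpToDate(true) {}

  // Callers that mutate the CFG without updating the tree mark it stale.
  void invalidate() { UpToDate = false; }
  void markUpdated() { UpToDate = true; }

  // LVI-style clients just null-check the result, as they already do.
  DominatorTree *get() const { return UpToDate ? DT : nullptr; }
};
```

With this, LVI's existing "do I have a DT?" checks degrade gracefully: it simply gives the (correct but weaker) no-DT answer whenever the tree is stale.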

I think PR36133 could be considered a different issue. LVI currently uses a domtree for certain queries involving llvm.assume; to make that work, the domtree would have to be up-to-date every time we call into LVI. But we could solve that separately from the problem of dead code in LVI.

Jan 22 2018

Is it a thing that is even different for each callgraph implementation? (If not, why is it a trait?)
Even if it is, if you look, you'll see we don't try to encode random computed node property differences in GraphTraits. Only the things that *every* graph algorithm needs go there (IE it covers what all algorithms need, not what any particular one needs). Random node property differences are handled through other trait classes or filtering. This does not, at a glance, seem like a thing that literally every algorithm that is going to use a Callgraph needs. It looks like one that only some may need. You even prove this yourself.

I don't understand why you need has_function_def. It looks very much like a thing you should just be using a filter_iterator for or something. It does not look like it belongs in the graphtraits at all.
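To illustrate the filtering alternative, here is a plain-C++ analogue of the filter_iterator idea (the `Node` type, its `HasDef` flag, and `idsWithDefs` are hypothetical toys, not the real call graph API): the predicate lives at the use site, and GraphTraits stays property-free.

```cpp
#include <vector>
#include <cassert>

// Toy node: some call graph nodes have a function definition, some don't.
struct Node { bool HasDef; int Id; };

// Minimal filter iterator: advances past elements failing the predicate.
template <typename It, typename Pred>
class FilterIterator {
  It Cur, End;
  Pred P;
  void skip() { while (Cur != End && !P(*Cur)) ++Cur; }

public:
  FilterIterator(It Begin, It End, Pred P) : Cur(Begin), End(End), P(P) {
    skip();
  }
  const auto &operator*() const { return *Cur; }
  FilterIterator &operator++() { ++Cur; skip(); return *this; }
  bool operator!=(const FilterIterator &O) const { return Cur != O.Cur; }
};

// Visit only nodes that carry a definition, skipping the rest.
std::vector<int> idsWithDefs(const std::vector<Node> &Nodes) {
  auto P = [](const Node &N) { return N.HasDef; };
  FilterIterator<std::vector<Node>::const_iterator, decltype(P)>
      I(Nodes.begin(), Nodes.end(), P), E(Nodes.end(), Nodes.end(), P);
  std::vector<int> Ids;
  for (; I != E; ++I)
    Ids.push_back((*I).Id);
  return Ids;
}
```

This keeps the "has a definition" distinction in the one algorithm that cares about it, instead of burdening every GraphTraits consumer with it.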

Jan 19 2018

I like the idea here a lot, but i don't understand why you need to make a new type of graph traits with things called call_edge_begin/end.
Can you explain the necessity here?
(IE why can't you just implement a graphtraits; why do you need a callgraphtraits?)

Note also that we mark all operands that are part of the expression as Deps. If the leader of those operands changed, they would already be marked as changed as Users through the addAdditionalUsers part we do.

Nov 27 2017

So, what Sanjoy has pointed out is that canonicalization like this to select breaks the ability to remove redundant loads and stores and do PRE, compared to the equivalent control-flow version, in general.

Nov 9 2017

Sorry, I didn't really look at the jump-threading testcase in the other patch. (In general, it's bad practice to write testcases which involve multiple passes because it confuses the issue you're actually trying to fix.) We don't need to strip the nsw in pre-jt-add.ll from the other patch, and we don't need to strip nsw for this testcase.

The PRE transform in the testcase in this patch can be split into two parts: one, we clone the load and move it into the predecessors, and two, we eliminate the fully redundant loads. Cloning the load is obviously safe: it's the same operation. Eliminating the fully redundant load is also safe: storing the value to memory and loading it does not change it. Stores do not erase poison; you just get poisoned memory.

Your patch has convinced me that GVN is probably generally broken here, but just happens to not value number or create new phi nodes except for PRE, so it doesn't generate a lot of instances of this issue.
So yeah, i'm fine with the patch in general

Oct 31 2017

I'll repeat what i said on the bug
FWIW: Given the choice, i would much rather fix the underlying problem than patch various passes to work around some small subset of it (here, loads of selected addresses)
This is underway right now, actually.

Oct 22 2017

So, in the original test and code it's hard to tell, but yes, if it's trying to undo the propagation of constants into phi nodes, it's already going to be fighting, because everything else is trying to do that transform specifically :)

Oct 17 2017

"Even if we mark for deletion, we should update the map immediately to be able to hoist across the marked instruction. Marking alone does not solve the problem, it will still need the same code that also updates the map. If marking is only needed for uniformity of GVN code, this can be done separately from fixing the actual bug."
The instructions will be deleted and the map updated as soon as we return from this function up the call stack.
It will not prevent anything further, because:

the hoisting has already happened (and all the functions in this stack are going to return and then the next thing that will happen is we will delete the dead instructions)

The marked instruction could not have blocked hoisting anyway, or else it would not have been safe to PRE.

Both of the deletions are in PRE, after it has been successful. If either the inserted or the deleted instructions were implicit control flow, PRE should have been unsafe. It's probably worth noting this in a comment

Oct 13 2017

Implement the previously discussed approach using generic LatticeKeys. A few things to note about this version of the patch:

First, there's a requirement that LatticeKeys are able to be the key_type of whatever mapping structure the generic solver uses internally (i.e., DenseMap). If a client uses a non-trivial type for LatticeKey, it will have to provide a specialization of DenseMapInfo. But we already have specializations for pointers and PointerIntPairs, which I think probably covers most potential clients. I've added a comment at the top of AbstractLatticeFunction mentioning this.
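As an illustration of what such a client would provide, here is a standalone sketch mirroring the shape of llvm::DenseMapInfo (this is not LLVM's actual header; `MyKey` is a hypothetical non-trivial LatticeKey, and the hash is a toy). The four members — empty key, tombstone key, hash, equality — are what DenseMap requires of its key type.

```cpp
#include <cassert>

// Primary template, left unspecialized, as in llvm/ADT/DenseMapInfo.h.
template <typename T> struct DenseMapInfo;

// Hypothetical non-trivial lattice key a client might use.
struct MyKey { int Id; };

template <> struct DenseMapInfo<MyKey> {
  // Sentinel values that must never collide with real keys.
  static MyKey getEmptyKey() { return {-1}; }
  static MyKey getTombstoneKey() { return {-2}; }
  // Toy hash; a real specialization would use a proper mix function.
  static unsigned getHashValue(const MyKey &K) {
    return static_cast<unsigned>(K.Id) * 37u;
  }
  static bool isEqual(const MyKey &L, const MyKey &R) {
    return L.Id == R.Id;
  }
};
```

As noted above, pointer and PointerIntPair keys already have specializations in tree, so only genuinely custom key types need to write this.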

Oct 11 2017

I do not plan to address these at the moment, because the order is consistent in the pass and doesn't affect correctness or what gets optimized.
If someone else wants to try to make everything reverse order clean, they are welcome to, it's not a priority for me.

So, i can't find what the suggestion was, but these are fairly specific to value lattices and shouldn't be used by most other things.
There are lattices where they are correct, and lattices where they make no sense.
It's also kinda conservative.

Oct 5 2017

I've taken some time to go through the code in NewGVN that computes similar things based on the suggestion from Danny. These now do *very* similar things in terms of walk. The differences I've seen are:

While the safety checks are both preorder, my code uses DFS instead of BFS. The reason is that we walk up the stack to mark more nodes as unsafe when we discover an unsafe node.
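The DFS-with-stack idea can be sketched as follows (toy data structures, not the NewGVN code itself; the `Unsafe` flags and adjacency list are assumptions for illustration): a preorder DFS keeps the current path on an explicit stack, and on discovering an intrinsically unsafe node, every ancestor currently on that stack is marked unsafe too — which is exactly what a BFS cannot do cheaply.

```cpp
#include <vector>
#include <cassert>

// Toy graph: adjacency list plus per-node "intrinsically unsafe" flags.
struct Graph {
  std::vector<std::vector<int>> Succs;
  std::vector<bool> Unsafe;
};

// Preorder DFS from Root; when an unsafe node is discovered, everything
// on the current DFS stack (its ancestors on this path) is marked too.
void markUnsafe(Graph &G, int Root, std::vector<bool> &Result) {
  std::vector<int> Stack;
  std::vector<bool> Visited(G.Succs.size(), false);

  struct Walker {
    Graph &G;
    std::vector<int> &Stack;
    std::vector<bool> &Visited, &Result;
    void visit(int N) {
      if (Visited[N])
        return;
      Visited[N] = true;
      Stack.push_back(N); // N is now on the current path
      if (G.Unsafe[N])
        for (int A : Stack) // propagate up the DFS stack
          Result[A] = true;
      for (int S : G.Succs[N])
        visit(S);
      Stack.pop_back();
    }
  } W{G, Stack, Visited, Result};

  W.visit(Root);
}
```

The stack is exactly the ancestor chain of the node being visited, so the "walk up and mark" step is a linear scan of it, with no parent pointers needed.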

It's worth noting: you can make propagation much faster by keeping a separate worklist for overdefined values, and prioritizing it over the normal one.
This will move things to overdefined as quickly as possible.
(Because anything met with overdefined is just about always overdefined.)