That is, make CHECK-DAG skip matches that overlap the matches of any
preceding consecutive CHECK-DAG directives. This change makes
CHECK-DAG more consistent with other directives, and there is evidence
it makes CHECK-DAG more intuitive and less error-prone. See the RFC
discussion starting at:

http://lists.llvm.org/pipermail/llvm-dev/2018-May/123010.html

Moreover, this behavior enables CHECK-DAG groups for unordered,
non-unique strings or patterns. For example, it is useful for
verifying output or logs from a parallel program, such as the OpenMP
runtime.
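For concreteness, a hedged sketch of such a test (the output lines and patterns here are invented, not taken from any real test):

```text
; Hypothetical output from a parallel program; the lines may appear in
; any order, and two of them are identical:
;
;   thread 1 done
;   thread 0 done
;   thread 0 done
;
; With non-overlapping CHECK-DAG, each directive must consume a
; distinct match, so the duplicated line must really occur twice:
;
; CHECK-DAG: thread 0 done
; CHECK-DAG: thread 1 done
; CHECK-DAG: thread 0 done
```

Under the old overlapping behavior, the two "thread 0 done" directives could both match the same line, so a missing duplicate would go undetected.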

This patch also implements the command-line option
-allow-deprecated-dag-overlap, which reverts CHECK-DAG to the old
overlapping behavior. This option should not be used in new tests.
It is meant only for the existing tests that are broken by this change
and that need time to update.
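For illustration (the RUN line below is hypothetical; only the flag name comes from this patch), such a test would opt into the old behavior like this:

```text
; RUN: cat %t.out | FileCheck -allow-deprecated-dag-overlap %s
```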

See the following bugzilla issue for tracking of such tests:

https://bugs.llvm.org/show_bug.cgi?id=37532

This patch should not be committed until we have patches ready to add
-allow-deprecated-dag-overlap to those tests.

Reading this in the overlapping mode: The X set of DAGs will all match on line 424, the Y set will all match on line 425, therefore the NOTs for rule 1 will correctly not match anything. The Z set of DAGs will all match on line 428, therefore the NOTs for rule 2 will correctly not match anything. Then the NOTs for rule 3 will match, making the run fail.

Is that what you intended? Naively I would have thought you'd try to break each of the rules, but that's not what the test does.

I was mimicking check-dag-xfails.txt. I was thinking -LABEL wouldn't work well because each RUN would see only one -LABEL, but I suppose all RUNs could share a common CHECK prefix for -LABEL. I'll work on that. Thanks.

No. y also matches on line 424 because overlapping is permitted, so we get a reordering complaint. You can confirm this with the -vv option.

Huh. I would have expected the -NOT directives for rule 1 to break up the DAG lines into separate X and Y groups, and the Y group's starting position would necessarily have been the end of the X group's last match.
At least, that appears to be the intended behavior of the IdentPatNot part of the test?
I admit I have not downloaded and applied the patches, I'll try to do that tomorrow.

The reordering check is a bit odd, regardless of my patch. It looks for y from the start of X, but it looks for the rest of Y from the end of X. My patch makes sure y doesn't overlap with members of X.

Speaking of the reordering check, we should discuss this comment change.

I made this change to reflect the implementation, but now I'm wondering if the comment described the desired behavior and the implementation was wrong. I'm not really sure. Moreover, this comment change does not reflect any change made by this patch, so perhaps it should be in a separate patch.

The implementation works as follows (with or without this patch, as there's no overlapping here). Using the X-Y model from earlier, the implementation seeks non-initial members of Y from the end of X rather than the start of Y. For example, the following succeeds even though the second member of Y matches before the first member of Y:
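A hypothetical test of the shape being described (all strings invented) illustrates the point:

```text
; Input:
;
;   start
;   two
;   one
;
; CHECK-DAG: start   <-- X
; CHECK-NOT: foo
; CHECK-DAG: one     <-- first member of Y; sought from the end of X
; CHECK-DAG: two     <-- second member of Y; also sought from the end
;                        of X, so it may match before 'one' does
```

This passes even though 'two', the second member of Y, matches earlier in the input than 'one', the first member of Y.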

To make things more confusing, if I move the occurrence of foo right before the occurrence of def, the CHECK-NOT fails. That is, the CHECK-NOTs are sought before the first member of Y, but the other members of Y are sought both before and after the first member of Y. It might be more intuitive if the first member of Y marks the beginning of Y (as the comment originally claimed) so that CHECK-NOTs are sought strictly between all of X and all of Y.

First the X group of DAGs runs, over the entire input text. The farthest-forward match (here 'b') is remembered.
Then the Y group of DAGs runs, from the farthest-forward X match (i.e., after 'b') to the end of the input text. The earliest match (here 'ghi') is remembered.
Finally the NOT runs, in the range from the end of the X matches (after 'b') to the beginning of the earliest Y match (before 'ghi').

The X matches and the Y matches can never overlap; even though they are both DAG groups, they are separated by the NOT. Therefore the search range of the Y group is constrained to start after the latest match of the X group.

Is that not what happens? Even with overlapping DAG matches enabled? Your explanation made it sound as if you expected the order of DAG directives in Y to make a difference to the NOT directive, when I think it should not.

First the X group of DAGs runs, over the entire input text. The farthest-forward match (here 'b') is remembered.

Agreed.

Then the Y group of DAGs runs, from the farthest-forward X match (i.e., after 'b') to the end of the input text.
The earliest match (here 'ghi') is remembered.

Only the first DAG in Y runs at this point, so 'def' is remembered.

Finally the NOT runs, in the range from the end of the X matches (after 'b') to the beginning of the earliest Y match (before 'ghi').

Before 'def'.

The X matches and the Y matches can never overlap; even though they are both DAG groups, they are separated by the NOT. Therefore the search range of the Y group is constrained to start after the latest match of the X group.

Agreed.

Is that not what happens? Even with overlapping DAG matches enabled?

My patch and -allow-deprecated-dag-overlap shouldn't affect these examples as there is no overlapping.

Your explanation made it sound as if you expected the order of DAG directives in Y to make a difference to the NOT directive, when I think it should not.

One approach is what the original comment here suggested: make the first DAG in Y mark the beginning of all Y matches. Basically, for that first directive in Y, CHECK-DAG becomes CHECK. This approach seems very quick to implement. However, it constrains what DAGs in Y can match, so it might break existing tests.

Another approach is to do what you described: match all DAGs in Y and then make the earliest match delimit the end of the preceding NOTs. That will require more work to implement, I think. However, it doesn't constrain what DAGs in Y can match. Instead, it constrains what the preceding NOTs can match, and constraining NOTs helps tests pass not fail. The question is whether they will be false passes that we'll never hear about. My hunch is no.

While #2 seems harder to implement, it seems like it wouldn't cause tests to fail, and I think it produces more intuitive behavior. What do you think?

Therefore the search range of the Y group is constrained to start after the latest match of the X group.

Agreed.

I got confused here. The search range of the first DAG in Y starts where X's search range starts, not where X's matches end. Otherwise, there would be no possibility of a reordering complaint.

While #2 seems harder to implement, it seems like it wouldn't cause tests to fail, and I think it produces more intuitive behavior. What do you think?

I think a group of DAGs should not behave differently just because it is preceded by a NOT, and so I still prefer #2.

I got confused here. The search range of the first DAG in Y starts where X's search range starts, not where X's matches end. Otherwise, there would be no possibility of a reordering complaint.

That suggests that the first DAG in Y is being mis-handled as part of the X group. I don't think that makes any sense at all. Either a NOT cleanly separates the surrounding DAGs into two groups, or they are all part of one bigger group. Taking the lexically first DAG from the second group and moving it to the first group is just wrong.

I think a group of DAGs should not behave differently just because it is preceded by a NOT, and so I still prefer #2.

Agreed. I'll look into it. If it doesn't break anything, we can put it in before this patch.

So that I can keep this straight: The above is about the search range end for the NOTs before Y. The below is about the search range start for Y and thus reordering detection.

I got confused here. The search range of the first DAG in Y starts where X's search range starts, not where X's matches end. Otherwise, there would be no possibility of a reordering complaint.

That suggests that the first DAG in Y is being mis-handled as part of the X group. I don't think that makes any sense at all. Either a NOT cleanly separates the surrounding DAGs into two groups, or they are all part of one bigger group. Taking the lexically first DAG from the second group and moving it to the first group is just wrong.

I agree the current reordering detection is broken. I think the two options you are proposing are the following:

1. Search for every DAG in Y from the end of X's matches until EOF. That would make reordering detection impossible, so we could just get rid of it.

2. Search for every DAG in Y from the start of X's matches until EOF. That way, reordering is detected for any Y member, not just the first.

(Replying here instead of inline as the thread is getting kind of long.)

Sorry, I didn't realize FileCheck would try to detect and explicitly diagnose reordering of DAG matches across a NOT. That seems like more trouble than it's worth. NOT is documented to enforce ordering, and we can make it just do that; if you mess up, your test fails by not matching rather than telling you something is in the wrong order. (Then you can use -v or -vv to work out what happened, if you need to.)

Once we define the DAGs as separated into distinct groups by a NOT, I think it's conceptually straightforward to say each DAG group has a "matching range" from the start of the earliest match to the end of the latest match. Two DAG groups can't have overlapping match ranges (just like you are making individual DAG matches not overlap).

Defining the search range of a DAG group does get a little fuzzy when it's not bounded by a normal CHECK. I think when that happens, we say it goes to the next LABEL, or start/end of file.

Similarly, if you have a NOT group, its range is bounded by the previous match (or last match of a preceding DAG group), and the following match (or first match of a following DAG group).

At that point, I think the behavior becomes tractable. The first DAG group finds what it finds, and the NOT forces no reordering, so the second DAG group's range starts at the endpoint of the first group's matches. After the second DAG group runs, the NOT's range is from the endpoint of the first group's matches to the earliest of the second group's matches.
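A hedged sketch of that model on invented strings (this annotates the proposed behavior, not necessarily what FileCheck does today):

```text
; Input:
;
;   a
;   b
;   c
;   d
;
; First DAG group: its match range spans from the start of 'a' to the
; end of 'b'.
; CHECK-DAG: b
; CHECK-DAG: a
; NOT range: from the end of the first group's matches (after 'b') to
; the start of the second group's earliest match (before 'c').
; CHECK-NOT: foo
; Second DAG group: its search range starts at the end of the first
; group's matches, so its match range cannot overlap the first group's.
; CHECK-DAG: d
; CHECK-DAG: c
```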

How does that all sound? If we can agree on this behavior, then probably I should write it up for the dev list to get a broader audience, before launching ahead with the implementation. But (if I do say so myself) I think this kind of conceptual model has been lacking in FileCheck as it has grown new features, and it's worth laying it out explicitly.

(Replying here instead of inline as the thread is getting kind of long.)

Sorry, I didn't realize FileCheck would try to detect and explicitly diagnose reordering of DAG matches across a NOT. That seems like more trouble than it's worth. NOT is documented to enforce ordering, and we can make it just do that; if you mess up, your test fails by not matching rather than telling you something is in the wrong order. (Then you can use -v or -vv to work out what happened, if you need to.)

Once we define the DAGs as separated into distinct groups by a NOT, I think it's conceptually straightforward to say each DAG group has a "matching range" from the start of the earliest match to the end of the latest match. Two DAG groups can't have overlapping match ranges (just like you are making individual DAG matches not overlap).

I think we should try to understand why reordering detection was implemented in the first place. I've imagined hypothetical examples where it could help, but I don't know if they occur in real tests. I generally have a hard time knowing what the right use cases for DAG-NOT-DAG are, and I imagine that's part of the motivation for the reordering detection: to help people not use it incorrectly. All of that can be part of the llvm-dev discussion you suggest.

Defining the search range of a DAG group does get a little fuzzy when it's not bounded by a normal CHECK. I think when that happens, we say it goes to the next LABEL, or start/end of file.

Yes, I believe that's what happens now.

Similarly, if you have a NOT group, its range is bounded by the previous match (or last match of a preceding DAG group), and the following match (or first match of a following DAG group).

Yes, we've agreed on that, and I've tinkered some with fixing the end of the range when followed by a DAG group. I think knowing whether that breaks existing tests would be helpful in any further discussion on that point.

At that point, I think the behavior becomes tractable. The first DAG group finds what it finds, and the NOT forces no reordering, so the second DAG group's range starts at the endpoint of the first group's matches. After the second DAG group runs, the NOT's range is from the endpoint of the first group's matches to the earliest of the second group's matches.

How does that all sound?

The main question remaining for me is about the start of the second DAG group's match range and thus whether reordering detection happens.

If we can agree on this behavior, then probably I should write it up for the dev list to get a broader audience, before launching ahead with the implementation.

I'm certainly fine with an llvm-dev discussion.

But (if I do say so myself) I think this kind of conceptual model has been lacking in FileCheck as it has grown new features, and it's worth laying it out explicitly.

Regarding the DAG-NOT-DAG reordering diagnostic: Rereading the comments, and the llvm-dev discussion, are we agreed that we can eliminate it? That is, the first DAG group defines some match range, the second DAG group defines a disjoint match range, and the NOT group looks at the text from the end of the first match range to the start of the second match range. No attempt to diagnose ordering problems.

The latest revision of the conceptual model says we won't allow SAME or NEXT after DAG. For purposes of determining valid/invalid directive sequences, the NOT wouldn't count. I think you still get the intended result without the SAME suffix.
I see you're wanting to (in effect) have intra-line DAG, but DAG doesn't provide that guarantee.
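For concreteness, a sequence of the disallowed shape would look like this (hypothetical patterns):

```text
; CHECK-DAG: alpha
; CHECK-NOT: beta
; CHECK-SAME: gamma   <-- rejected: SAME follows a DAG, and the
;                         intervening NOT does not count
```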

This is the first set of checks where there's a difference in behavior depending on overlap or not. Given the test is trying to exercise exactly that, the first two sets of checks aren't useful. (They might be appropriate for a general DAG functionality test, but not here.)

I skimmed the rest of the checks, and offhand the comments on the EndAfterEnd part likely apply to the other parts: (a) don't bother with sets of checks that don't overlap; (b) don't bother with redundant sets of checks that vary only in the number of overlapping characters.

Regarding the DAG-NOT-DAG reordering diagnostic: Rereading the comments, and the llvm-dev discussion, are we agreed that we can eliminate it? That is, the first DAG group defines some match range, the second DAG group defines a disjoint match range, and the NOT group looks at the text from the end of the first match range to the start of the second match range. No attempt to diagnose ordering problems.

Agreed. I have a patch to implement that. I'll clean it up and post it after we've cleared some of the existing queue. It applies afterward.

The latest revision of the conceptual model says we won't allow SAME or NEXT after DAG. For purposes of determining valid/invalid directive sequences, the NOT wouldn't count. I think you still get the intended result without the SAME suffix.

Generally speaking, CHECK and CHECK-SAME always behave the same when the output is as expected for CHECK-SAME, right? When the output is not as expected for CHECK-SAME, CHECK could match the same text on a later line, potentially delaying or eliminating the desired diagnostic.
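A hypothetical pair of check sequences (invented strings) shows the difference when the output is not as expected:

```text
; Input:
;
;   foo
;   bar
;
; CHECK:      foo
; CHECK-SAME: bar   <-- fails: 'bar' is not on the same line as 'foo'
;
; versus:
;
; CHECK: foo
; CHECK: bar        <-- passes by matching 'bar' on the later line,
;                       hiding the problem CHECK-SAME would have caught
```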

I see you're wanting to (in effect) have intra-line DAG, but DAG doesn't provide that guarantee.

Yes, that's what I want. We don't have it, so I went for the next best thing.

Until we have an intra-line DAG, I'd like to request that we don't implement the restriction for SAME after DAG.

Even if we do implement an intra-line DAG, my gut says the SAME/NEXT after DAG restrictions eliminate reasonable use cases, but I don't have real-world examples right now, so I won't oppose the restrictions unless I come up with some examples.

I'm just doing boundary testing. None is a different class than one, which is a different class than multiple. Why is one different than multiple? Because off-by-one is a common bug that might cause incorrect behavior or mask another bug.

Regarding SAME/NEXT after DAG, I do not have a strong opinion. The question is mainly whether it will make sense to people who do not spend a lot of time studying how FileCheck works. I'm okay with leaving them in.

In the case of string pattern matching I disagree that one is a different class than multiple. I see it as a ternary <=> result, and it doesn't matter how far unequal the substring boundaries are.
The advantage to removing it is that it's one less case for a human to understand when reading the test.

Regarding SAME/NEXT after DAG, I do not have a strong opinion. The question is mainly whether it will make sense to people who do not spend a lot of time studying how FileCheck works.

Your formalized description is what gives me hope that we can describe these behaviors in a way that people will understand, but I agree it's better if it's just intuitive without studying documentation.

I'm okay with leaving them in.

In that case, I say we let it be for now. We can revisit after we finish this series of patches and after we consider an intra-line DAG. By then, more concrete examples might arise.

In the case of string pattern matching I disagree that one is a different class than multiple. I see it as a ternary <=> result, and it doesn't matter how far unequal the substring boundaries are.

It might not matter in one particular correct implementation, but it can matter in subtle ways as implementations evolve clever new techniques and break.

The advantage to removing it is that it's one less case for a human to understand when reading the test.

I don't agree that removing the tests for the "multiple cases" is going to make this test file substantially easier to understand. Once you understand the general sequence of tests here, understanding the difference between one and many seems like very little more to ask. I'm happy to add comments if you feel that would help.

Obviously, there are no hard rules on how to design test cases effectively, so we write them based on our own experiences and on general guidelines about things like off-by-one. In the past, I've spent a lot of time writing tests for parsers, and I've found that carefully testing boundary conditions in this way is important.

I feel pretty strongly about this, but I don't want to have to stall this patch indefinitely because we have different views on proper testing. If you're adamant, I'll defer. Alternatively, we could invite a third opinion to break the tie. You can pick whomever.

That's fine (clearly we'd remove the RUN lines with the deprecated switch, when that gets removed).
But in that case, there are already a bunch of tests for non-overlapping DAG lines in other check-dag*.txt tests, and it distracts from the focus of this test to repeat them here. That would be the first two sets of checks in EndAfterEnd, and corresponding sets in the other cases.
(Or if you think the existing DAG tests in the other files are inadequate, beef up those tests, and keep this one focused on overlap.)

That's fine (clearly we'd remove the RUN lines with the deprecated switch, when that gets removed).

Right.

But in that case, there are already a bunch of tests for non-overlapping DAG lines in other check-dag*.txt tests, and it distracts from the focus of this test to repeat them here. That would be the first two sets of checks in EndAfterEnd, and corresponding sets in the other cases.
(Or if you think the existing DAG tests in the other files are inadequate, beef up those tests, and keep this one focused on overlap.)

I see check-dag-overlap-torture.txt as a careful, organized sweep through all the various cases basic overlap detection must handle. In my opinion, dropping chunks of it that happen to be covered for some other purpose elsewhere makes this test more confusing, and it makes the overall testing of overlap detection less maintainable. I don't think those chunks distract from the focus of this test because the focus of this test is the correct behavior of the overlapping detection mechanism, and that includes no false positives, especially at boundary conditions.

As in the discussion of one vs. multiple, I'm pretty adamant, and I don't feel like we're converging. Just give me the word, and I'll yield, or you can invite a third opinion if you know someone who might offer another perspective.

I'm not trying to be difficult on these points. I just disagree. But I'd still like us to find a path forward. A third person might help.

At that point, I think the behavior becomes tractable. The first DAG group finds what it finds, and the NOT forces no reordering, so the second DAG group's range starts at the endpoint of the first group's matches. After the second DAG group runs, the NOT's range is from the endpoint of the first group's matches to the earliest of the second group's matches.

How does that all sound?

The main question remaining for me is about the start of the second DAG group's match range and thus whether reordering detection happens.

For the record, I'm pretty sure I meant search range here. We didn't have formal definitions for that terminology yet, and I must have gotten it confused.

To be clear, when I say "I'll yield" or "I'll defer", I mean that I'm willing to do as you ask in order to proceed. Of course, I'd rather us reach agreement on what's right, but it doesn't seem like we're making arguments that are persuasive to each other, so I suggested the alternative of inviting a third person, and I'm hoping you know someone.

We never really resolved this issue. Given our agreement about how DAG-NOT-DAG will work in the future (as described in your formal model, and as I've implemented in a patch I still need to post), I believe the only aspects of this test that must change are the comments about reordering detection... because there will be no more reordering detection.

Otherwise, I'm inclined to think this test still makes sense because it exercises overlapping cases that used to be mishandled. Does that make sense to you?

I think the only issues left to address in this thread are: (1) remove reordering detection and fix search ranges, and (2) accordingly update the comment that this thread was originally discussing. Again, #1 happens in a later patch I still need to post, and #2 will be addressed there too. Before that patch, the comment change as it appears above is accurate, so I think I should just leave it be for now. For those reasons, I'm marking these comments as done, but let me know if you disagree.

Given that you feel so very strongly about it, and that this is a tool that really has absolutely zero exposure outside our own test system, I'm willing to say that erring on the side of extra testing is acceptable here.

Given that you feel so very strongly about it, and that this is a tool that really has absolutely zero exposure outside our own test system, I'm willing to say that erring on the side of extra testing is acceptable here.

Hi Paul. Thanks very much for your consideration. I apologize for being difficult.

Those need to be committed at the same time as this one to avoid lots of buildbot failures. They're large but simple-minded patches: I just blindly added -allow-deprecated-dag-overlap to tests that break after adding this patch. Given that there are many run lines per test, it's possible that some additions weren't actually necessary, but they don't cause new unexpected failures for me, and our plan was to investigate these tests one by one afterward anyway. I'm currently pulling, rebasing, and re-running tests locally. I'll extend those patches for any new failures afterward.

Would you please accept those patches as well? Combing through every line might be a lot to ask, so you could accept on the condition that I actually did what I said I did, and I'll take responsibility for any problems.