Activity

I'm on the fence about allowing inline fn expressions and fn literals as guards. The problem is that they can't be checked for equality and thus tests cannot be shared across guard patterns. I need to think about it some more but I don't consider this high priority in the near future. Any decision will have to take into consideration the goal of predicate dispatch.

David Nolen
added a comment - 10/Jan/12 10:04 PM I'm on the fence about allowing inline fn expressions and fn literals as guards. The problem is that they can't be checked for equality and thus tests cannot be shared across guard patterns. I need to think about it some more but I don't consider this high priority in the near future. Any decision will have to take into consideration the goal of predicate dispatch.

Chris Gray
added a comment - 11/Jan/12 12:41 PM - edited Sorry, I think the fact that the functions in my earlier example were function literals was a bit of a red herring. The following test also fails.

I really don't see how, given that there seems to be no code that specializes on the type of function given to a guard. My guess is that when guard-pattern-match-5 succeeds, guard-pattern-match-4 will succeed as well.

Chris Gray
added a comment - 11/Jan/12 12:50 PM I really don't see how, given that there seems to be no code that specializes on the type of function given to a guard. My guess is that when guard-pattern-match-5 succeeds, guard-pattern-match-4 will succeed as well.

Oh sorry you are right. This is exactly same problem. We can't know that odd? and oddp are the same. Again this is not something I'm interested in fixing without a lot more consideration.

Basically functions can't be tested for equality like types can. This means that the presence of a guard must create a backtrack point. However if we make guards work a little more like types (you have to declare them ahead of time) we lose a little bit of convenience but gain a lot of reasoning power and can share tests and avoid these pitfalls.

David Nolen
added a comment - 11/Jan/12 1:00 PM Oh sorry you are right. This is exactly same problem. We can't know that odd? and oddp are the same. Again this is not something I'm interested in fixing without a lot more consideration.
Basically functions can't be tested for equality like types can. This means that the presence of a guard must create a backtrack point. However if we make guards work a little more like types (you have to declare them ahead of time) we lose a little bit of convenience but gain a lot of reasoning power and can share tests and avoid these pitfalls.
This discussion probably needs a design page.

They are not the same problem. These don't work for very different reasons. The first I probably won't even address and will leave for the community to deal with - I don't need robust pattern matching on regexes.

The second example is a legitimate bug around matching locals which is unrelated to this ticket. Feel free to open a new one for it.

David Nolen
added a comment - 11/Jan/12 6:58 PM They are not the same problem. These don't work for very different reasons. The first I probably won't even address and will leave for the community to deal with - I don't need robust pattern matching on regexes.
The second example is a legitimate bug around matching locals which is unrelated to this ticket. Feel free to open a new one for it.

On further reflection, what all these examples show is that Maranget's algorithm is only correct for literals whose equality you can test at compile time. Thus, not even locals will work using his algorithm. Regexes and functions will certainly not work correctly 100% of the time.

What happens is that when multiple unequal tests in the same column can return a truthy value, you end up with a decision forest rather than a decision tree. If the first decision tree in the forest has the first and third end-states, while the second tree has the second end-state, if you end up in the third end-state, you must still check the second decision tree before you decide which end-state is actually correct.

This is a shame, since it means that compiling the matches becomes more complex. On the other hand, it seems like a great subject for a paper at programming-language conference, so there's always that.

On a serious note, though, this bug is major, and you should consider removing support for at least guards, locals, and regexes until it is fixed. The bugs that arise from it in the end-user's code are really hard to track down – it's as if `or` or `and` were broken 10% of the time.

Chris Gray
added a comment - 12/Jan/12 11:20 AM On further reflection, what all these examples show is that Maranget's algorithm is only correct for literals whose equality you can test at compile time. Thus, not even locals will work using his algorithm. Regexes and functions will certainly not work correctly 100% of the time.
What happens is that when multiple unequal tests in the same column can return a truthy value, you end up with a decision forest rather than a decision tree. If the first decision tree in the forest has the first and third end-states, while the second tree has the second end-state, if you end up in the third end-state, you must still check the second decision tree before you decide which end-state is actually correct.
This is a shame, since it means that compiling the matches becomes more complex. On the other hand, it seems like a great subject for a paper at programming-language conference, so there's always that.
On a serious note, though, this bug is major, and you should consider removing support for at least guards, locals, and regexes until it is fixed. The bugs that arise from it in the end-user's code are really hard to track down – it's as if `or` or `and` were broken 10% of the time.

There is nothing wrong with Maranget's algorithm. We just have to be sure that we create a backtrack point - that's all.

Functions cannot be fixed because function equality is undecideable. So for guards we might create a backtrack point. I've already updated the README to describe what works at the moment. I have a branch which throws an error if :when is not given a vector of symbols or a symbol. It should probably be improved so that symbols are known top levels (no reassigning fns to locals).

Regex equality can probably be made to work but I'm not going to do it. (On further thought we can probably make patterns create backtrack points by default, can be overridden for those willing to make their patterns highly optimizeable)

Locals can be fixed, we'll definitely create a backtrack point for these.

David Nolen
added a comment - 12/Jan/12 11:29 AM - edited There is nothing wrong with Maranget's algorithm. We just have to be sure that we create a backtrack point - that's all.
Functions cannot be fixed because function equality is undecideable. So for guards we might create a backtrack point. I've already updated the README to describe what works at the moment. I have a branch which throws an error if :when is not given a vector of symbols or a symbol. It should probably be improved so that symbols are known top levels (no reassigning fns to locals).
Regex equality can probably be made to work but I'm not going to do it. (On further thought we can probably make patterns create backtrack points by default, can be overridden for those willing to make their patterns highly optimizeable)
Locals can be fixed, we'll definitely create a backtrack point for these.

I'm sorry, but this problem will not be solved by backtracking alone. At least not with the backtracking mechanism that currently exists.

With backtracking, you are still treating the problem as though you have a decision tree. A decision tree requires that all the tests at its nodes are mutually exclusive.

By assuming that you have a decision tree, once a match is found in the tree, that match is returned. As my last comment pointed out, that is not sufficient. You must also check to see if there is a match in an end-state that was declared earlier. I really don't see that that's possible given the current backtracking system.

Chris Gray
added a comment - 12/Jan/12 11:54 AM - edited I'm sorry, but this problem will not be solved by backtracking alone. At least not with the backtracking mechanism that currently exists.
With backtracking, you are still treating the problem as though you have a decision tree. A decision tree requires that all the tests at its nodes are mutually exclusive.
By assuming that you have a decision tree, once a match is found in the tree, that match is returned. As my last comment pointed out, that is not sufficient. You must also check to see if there is a match in an end-state that was declared earlier. I really don't see that that's possible given the current backtracking system.

I don't follow you. Maranget's algorithm is not sufficient for pattern matching if we don't constrain columns to specific types. There are many things already in place to deal with the shortcomings of Maranget's approach given what we want to accomplish - for example, we actually have a grouping pass. This is the same approach that Racket uses as far as I can tell and they don't have any problems. Certainly none of the patterns you shown thus that far (besides fns exprs) pose any challenges that I can see.

David Nolen
added a comment - 12/Jan/12 1:26 PM - edited I don't follow you. Maranget's algorithm is not sufficient for pattern matching if we don't constrain columns to specific types. There are many things already in place to deal with the shortcomings of Maranget's approach given what we want to accomplish - for example, we actually have a grouping pass. This is the same approach that Racket uses as far as I can tell and they don't have any problems. Certainly none of the patterns you shown thus that far (besides fns exprs) pose any challenges that I can see.

Sorry, I was editing my comment as you made yours. I hope it is more clear now.

I guess I don't totally understand your code still, so I will try to rectify that before commenting again. From what I have seen, though, you are trying to build a decision tree. What I am saying is that isn't possible at compile time, since you can't ensure that the nodes of the tree are mutually exclusive.

Chris Gray
added a comment - 12/Jan/12 1:42 PM Sorry, I was editing my comment as you made yours. I hope it is more clear now.
I guess I don't totally understand your code still, so I will try to rectify that before commenting again. From what I have seen, though, you are trying to build a decision tree. What I am saying is that isn't possible at compile time, since you can't ensure that the nodes of the tree are mutually exclusive.

All known constructors are considered mutually exclusive. We group all constructors in a column preserving order as closely as possible. Decisions trees are created for these constructors. If we cannot know at compile time whether something is mutually exclusive (wildcards, locals), we create a backtrack point to handle them.

Consider that if we only have backtrack points (no trees) all tests could potentially be tried. Our approach is a hybrid one - we don't rely only on decisions trees and we don't rely only on backtracking.

When we get to integrating pattern matching on interfaces, protocols ambiguities of course become possibly. But even this can probably be handled reasonably with something like "prefers".

David Nolen
added a comment - 12/Jan/12 1:51 PM - edited All known constructors are considered mutually exclusive. We group all constructors in a column preserving order as closely as possible. Decisions trees are created for these constructors. If we cannot know at compile time whether something is mutually exclusive (wildcards, locals), we create a backtrack point to handle them.
Consider that if we only have backtrack points (no trees) all tests could potentially be tried. Our approach is a hybrid one - we don't rely only on decisions trees and we don't rely only on backtracking.
When we get to integrating pattern matching on interfaces, protocols ambiguities of course become possibly. But even this can probably be handled reasonably with something like "prefers".

Where the numbers are the order in which the terminals appear, and they have stars beside them if they match. The correct terminal for core.match to return in this case is the second one. Currently, the code would return the third terminal. Suppose, however, that backtracking was added so that the tree rooted at b was checked for matches as well. This is certainly possible, though a lot of information would need to be kept about the match already found (which is what I meant about things not working with the current backtracking system). Also, you must ensure that the testing stops when you hit the second terminal, for two reasons – first, not doing so would imply that all terminals are checked, and second, the test to distinguish 4 from 5 could throw an exception. For similar reasons, the return value of the third terminal can't be computed – it could be a very long computation, or it could throw an exception.

Where the numbers are the order in which the terminals appear, and they have stars beside them if they match. The correct terminal for core.match to return in this case is the second one. Currently, the code would return the third terminal. Suppose, however, that backtracking was added so that the tree rooted at b was checked for matches as well. This is certainly possible, though a lot of information would need to be kept about the match already found (which is what I meant about things not working with the current backtracking system). Also, you must ensure that the testing stops when you hit the second terminal, for two reasons – first, not doing so would imply that all terminals are checked, and second, the test to distinguish 4 from 5 could throw an exception. For similar reasons, the return value of the third terminal can't be computed – it could be a very long computation, or it could throw an exception.

David Nolen
added a comment - 12/Jan/12 3:28 PM If the correct terminal is the second one, we will return the second one. No information needs to be kept around. I suggest you take a closer look at the code at this point.

Aah, you're right. (I think.) Might it be more accurate to say that the situation I proposed can't happen? That is, no two trees are created where there are lower-numbered terminals in the second tree than in the first tree?

Is the plan to add backtrack points for everything where equality can't be determined at compile time?

Chris Gray
added a comment - 12/Jan/12 6:58 PM - edited Aah, you're right. (I think.) Might it be more accurate to say that the situation I proposed can't happen? That is, no two trees are created where there are lower-numbered terminals in the second tree than in the first tree?
Is the plan to add backtrack points for everything where equality can't be determined at compile time?

David Nolen
added a comment - 13/Jan/12 9:15 AM - edited Wow, this is great. I've skimmed over the patches and they look pretty good. I will go over them more closely as soon as I can - there are a couple changes we should probably make. Thanks!

1. Let's rename mutually-exclusive-inequality? to comparable?
2. I want GuardPatterns to comparable, I'm not going to support arbitrary fns - not worth it IMO.
As a compromise I'd be willing to rename :when patterns -> PredicatePatterns and GuardPatterns
can be allowed via :guard.
3. No need to declare OrPatterns as comparable - they aren't real patterns
4. No need to exclude VectorPatterns - the way that pattern matching works makes this unnecessary

David Nolen
added a comment - 14/Feb/12 4:39 PM Sorry for the epic delay. Here are my notes on that patches:
1. Let's rename mutually-exclusive-inequality? to comparable?
2. I want GuardPatterns to comparable, I'm not going to support arbitrary fns - not worth it IMO.
As a compromise I'd be willing to rename :when patterns -> PredicatePatterns and GuardPatterns
can be allowed via :guard.
3. No need to declare OrPatterns as comparable - they aren't real patterns
4. No need to exclude VectorPatterns - the way that pattern matching works makes this unnecessary
If you make these changes I promise to apply these in a timely manner

Okay, patch uploaded. There is a fairly significant portion of it that is just cut and paste, which I'm not so happy about, but I don't think there's a way to do type inheritance, so I did the easier thing.

Chris Gray
added a comment - 15/Feb/12 12:50 PM Okay, patch uploaded. There is a fairly significant portion of it that is just cut and paste, which I'm not so happy about, but I don't think there's a way to do type inheritance, so I did the easier thing.