Omer, could you also include in your patch a few test cases? At least one that validates that two regex's that should be equal, are equal, and another that two regex's that should be different, are non-equal. Preferably the first of those tests should fail with the existing Clojure code, and pass with your changes.

Andy Fingerhut
added a comment - 12/Mar/13 5:54 PM Omer, could you also include in your patch a few test cases? At least one that validates that two regex's that should be equal, are equal, and another that two regex's that should be different, are non-equal. Preferably the first of those tests should fail with the existing Clojure code, and pass with your changes.

I updated the patch with some tests. Hope I added them in the correct file. I also changed the implementation a bit, by instead of adding another implementation of equiv with a different signature, checking the type of the Object in the equiv method with signature (Object k1, Object k2).
For the sake of consistency I also added an EquivPred implementation, though I'm not entirely sure when its used.

Omer Iqbal
added a comment - 13/Mar/13 5:39 AM I updated the patch with some tests. Hope I added them in the correct file. I also changed the implementation a bit, by instead of adding another implementation of equiv with a different signature, checking the type of the Object in the equiv method with signature (Object k1, Object k2).
For the sake of consistency I also added an EquivPred implementation, though I'm not entirely sure when its used.

All comments here refer to the patch named fix-CLJ-1182.diff dated Mar 13, 2013.

The location for the new tests looks reasonable. However, note that your new patch has your old changes plus some new ones, not just the new ones. In particular, the new signature for equiv is still in your latest patch. You should start from a clean pull of the latest Clojure master and make only the changes you want when creating a patch, not build on top of previous changes you have made.

Also, there are several whitespace-only changes in your patch that should not be included.

Andy Fingerhut
added a comment - 13/Mar/13 1:04 PM All comments here refer to the patch named fix-CLJ-1182.diff dated Mar 13, 2013.
The location for the new tests looks reasonable. However, note that your new patch has your old changes plus some new ones, not just the new ones. In particular, the new signature for equiv is still in your latest patch. You should start from a clean pull of the latest Clojure master and make only the changes you want when creating a patch, not build on top of previous changes you have made.
Also, there are several whitespace-only changes in your patch that should not be included.

I uploaded a clean patch, removing the whitespace diff and only adding the latest changes. Thanks for clarifying the workflow!
Just to clarify, this refers to the patch named fix-CLJ-1182.diff dated Mar 13 1:34 PM

Omer Iqbal
added a comment - 13/Mar/13 1:39 PM I uploaded a clean patch, removing the whitespace diff and only adding the latest changes. Thanks for clarifying the workflow!
Just to clarify, this refers to the patch named fix-CLJ-1182.diff dated Mar 13 1:34 PM

I am recategorizing this as an enhancement, because if this is a bug it is a bug in Java – the Java Patterns class documents being immutable, but apparently does not implement .equals.

Other recent "make Clojure more complicated to work around problems in Java" patches have been rejected, and I suspect this one will be too, because it might impact the performance of equality everywhere.

Stuart Halloway
added a comment - 29/Mar/13 5:46 AM I am recategorizing this as an enhancement, because if this is a bug it is a bug in Java – the Java Patterns class documents being immutable, but apparently does not implement .equals.
Other recent "make Clojure more complicated to work around problems in Java" patches have been rejected, and I suspect this one will be too, because it might impact the performance of equality everywhere.

At first pass, Rich and I both believe that, as regex equality is undecidable, that Java made the right choice in using identity for equality, that this ticket should be declined, and the ClojureScript should revert to match.

But leaving this ticket open for now so that ClojureScript dev can weigh in.

Stuart Halloway
added a comment - 12/Apr/13 9:04 AM At first pass, Rich and I both believe that, as regex equality is undecidable, that Java made the right choice in using identity for equality, that this ticket should be declined, and the ClojureScript should revert to match.
But leaving this ticket open for now so that ClojureScript dev can weigh in.

Alexander Redington
added a comment - 12/Apr/13 10:03 AM If Regex instances were interned by the reader, but still used identity for equality, then code like
(= #"abc" #"abc")
would return true, but it wouldn't diverge in the definition of equality for regular expressions between Java and Clojure.

Undecidable means that for any given regular expression, there is no single way to write it. For example #"[a]" #"a" both match the same strings, but are they the same? Maybe. But how can we decide if /any/ given regular expression matches all of the same strings as any other? The answer is, you can't. Java does provide a Pattern#pattern method that returns the string that was used to build it, but I'm not sure if that could/should be used as a basis for equality given the potential perf hit.

Fogus
added a comment - 12/Apr/13 10:13 AM Undecidable means that for any given regular expression, there is no single way to write it. For example #"[a]" #"a" both match the same strings, but are they the same? Maybe. But how can we decide if /any/ given regular expression matches all of the same strings as any other? The answer is, you can't. Java does provide a Pattern#pattern method that returns the string that was used to build it, but I'm not sure if that could/should be used as a basis for equality given the potential perf hit.

From my post to the ml thread, it might not be entirely clear, why I don't think we should implement equality for host regexes:

It would involve parsing and would leave a lot of room for errors and platform-idiosycracies to leak. And it would be different for every platform.

As Alexander said, I also think this ticket could be resolved by interning regex literals, akin to keywords. That, however, would warrant a design page first, because there are tradeoffs to be made about how far the interning should go.

Herwig Hochleitner
added a comment - 12/Apr/13 9:42 PM From my post to the ml thread, it might not be entirely clear, why I don't think we should implement equality for host regexes:
It would involve parsing and would leave a lot of room for errors and platform-idiosycracies to leak. And it would be different for every platform.
As Alexander said, I also think this ticket could be resolved by interning regex literals, akin to keywords. That, however, would warrant a design page first, because there are tradeoffs to be made about how far the interning should go.

Rich Hickey
added a comment - 13/Apr/13 8:51 AM Why are we spending time on this? Where is the problem statement? Does someone actually need this for a real world purpose not solved by using regex strings as keys?