Details

Description

range should implement IReduce for fast reduction.

Patch: clj-1515-14.patch - Java range with long optimizations

Approach: The clj-1515-14 patch revives the unused clojure.lang.Range class. This class is similar in implementation to ChunkedCons. It differs by lazily constructing the first chunk (this is done in the existing impl by using a LazySeq wrapper) and by caching the next reference like other seq impls. The latter is done because range is frequently used in both chunked and unchunked traversal.

Additionally, 1515-14 contains an optimized version of range (LongRange) for the extremely common case where start, end, and step are all longs. This version uses primitive longs and primitive math and a customized version of ArrayChunk for greater performance.

The special case of (range) is just handled with (iterate inc' 0) (which was further optimized for reduce in CLJ-1603).

Alternatives:

CLJ-1515-deftype3.patch took the approach of using deftype to create a seqable, reducible entity that was not actually a seq. Based on work for CLJ-1603, it was established that due to historical requirements, this is not viable and the entity returned for range should implement the ISeq and Collection interfaces directly.

clj-1515-11.patch uses the approach of a "split" implementation - a LazySeq that has a fast reduce path. This is a minimal patch that provides no perf difference for sequence usage but a moderate improvement for reduce paths. One important difference vs clj-1515-12 is that the fast reduce path is only obtained on the initial range. Once you walk off the head, the performance will be the same as prior (seq perf).

Performance

criterium quick-bench with java 1.8.0-b132

code

1.7.0-master

Java clj-1515-14

split clj-1515-11

(count (range (* 1024 1024)))

63 ms

0 ms

58 ms

(reduce + (map inc (range (* 1024 1024))))

50 ms

33 ms

50 ms

(reduce + (map inc (map inc (range (* 1024 1024)))))

68 ms

52 ms

67 ms

(count (keep odd? (range (* 1024 1024))))

73 ms

52 ms

67 ms

(transduce (comp (map inc) (map inc)) + (range (* 1024 1024)))

46 ms

26 ms

34 ms

(reduce + 0 (range (* 2048 1024)))

67 ms

19 ms

40 ms

(reduce + 0 (rest (range (* 2048 1024))))

67 ms

19 ms

57 ms

(doall (range 0 31))

1.04 µs

0.75 µs

1.15 µs

(doall (range 0 32))

1.08 µs

0.76 µs

1.18 µs

(doall (range 0 4096))

135 µs

96 µs

145 µs

(into [] (map inc (range 31)))

1.81 µs

1.30 µs

1.88 µs

(into [] (map inc) (range 31))

1.71 µs

0.75 µs

1.02 µs

(into [] (range 128))

4.82 µs

2.20 µs

3.27 µs

(doall (range 1/2 1000 1/3))

1.24 ms

1.24 ms

1.28 µs

(doall (range 0.5 1000 0.33))

126 µs

147 µs

151 µs

(into [] (range 1/2 1000 1/3))

1.26 ms

1.20 ms

1.26 ms

(into [] (range 0.5 1000 0.33))

148 µs

91 µs

130 µs

(count (filter odd? (take (* 1024 1024) (range))))

185 ms

167 ms

176 ms

(transduce (take (* 1024 1024)) + (range))

67 ms

35 ms

68 ms

Performance notes:

1515-14 (Java) - significant improvements in virtually all use cases, both seq and reduce. The final two cases with (range) leverage optimizations from CLJ-1603.

1515-11 (split) - a much smaller patch that gives no seq benefits and smaller reduction benefits. Only the head of the range will receive the perf benefits (see (reduce + 0 (rest (range (* 2048 1024))))).

Questions
(range) and supports auto-promotion towards infinity in this patch, which seems to be implied by the doc string but was not actually implemented or tested correctly afaict.

Activity

1) Not sure about losing chunked seqs - that would make older usage slower, which seems undesirable.
2) RangeIterator.next() needs to throw NoSuchElementException when walking off the end
3) I think Range should implement IReduce instead of relying on support for CollReduce via Iterable.
4) Should let _hash and _hasheq auto-initialize to 0 not set to -1. As is, I think _hasheq always would be -1?
5) _hash and _hasheq should be transient.
6) count could be cached (like hash and hasheq). Not sure if it's worth doing that but seems like a win any time it's called more than once.
7) Why the change in test/clojure/test_clojure/serialization.clj ?
8) Can you squash into a single commit?

Alex Miller
added a comment - 29/Aug/14 3:19 PM 1) Not sure about losing chunked seqs - that would make older usage slower, which seems undesirable.
2) RangeIterator.next() needs to throw NoSuchElementException when walking off the end
3) I think Range should implement IReduce instead of relying on support for CollReduce via Iterable.
4) Should let _hash and _hasheq auto-initialize to 0 not set to -1. As is, I think _hasheq always would be -1?
5) _hash and _hasheq should be transient.
6) count could be cached (like hash and hasheq). Not sure if it's worth doing that but seems like a win any time it's called more than once.
7) Why the change in test/clojure/test_clojure/serialization.clj ?
8) Can you squash into a single commit?

1) I agree, adding chunked seqs to this will dramatically increase complexity, are we sure we want this?
2) exception added
3) I can add IReduce, but it'll pretty much just duplicate the code in protocols.clj. If we're sure we want that I'll add it too.
4) fixed hash init values, defaults to -1 like ASeq
5) hash fields are now transient
6) at the cost of about 4 bytes we can cache the cost of a multiplication and an addition, doesn't seem worth it?
7) the tests in serialization.clj assert that the type of the collection roundtrips. This is no longer the case for range which starts as Range and ends as a list. The change I made converts range into a list so that it properly roundtrips. My assumption is that we shouldn't rely on all implementations of ISeq to properly roundtrip through EDN.
8) squashed.

Timothy Baldridge
added a comment - 29/Aug/14 3:40 PM 1) I agree, adding chunked seqs to this will dramatically increase complexity, are we sure we want this?
2) exception added
3) I can add IReduce, but it'll pretty much just duplicate the code in protocols.clj. If we're sure we want that I'll add it too.
4) fixed hash init values, defaults to -1 like ASeq
5) hash fields are now transient
6) at the cost of about 4 bytes we can cache the cost of a multiplication and an addition, doesn't seem worth it?
7) the tests in serialization.clj assert that the type of the collection roundtrips. This is no longer the case for range which starts as Range and ends as a list. The change I made converts range into a list so that it properly roundtrips. My assumption is that we shouldn't rely on all implementations of ISeq to properly roundtrip through EDN.
8) squashed.

Alex Miller
added a comment - 01/Oct/14 2:00 PM - edited The new patch looks good. Could you do a test to determine the perf difference from walking the old chunked seq vs the new version? If the perf diff is negligible, I think we can leave as is.
Another idea: would it make sense to have a specialized RangeLong for the (very common) case where start, end, and step could all be primitive longs? Seems like this could help noticeably.

I think penalizing existing code possibly 5x is a hard cost to stomach. Is there another approach where a protocolized range can live outside of core? CLJ-993 has a patch that makes it a reducible source in clojure.core.reducers, but it's coll-reduce not IReduce, and doesn't contain an Iterator. Otherwise we might have to take the chunked seq challenge.

Alex: Re long/float. Old reified Ranged.java in clojure.lang blindly assumes ints, it would be nice to have a long vs. float version, though I believe the contract of reduce boxes numbers. (Unboxed math can be implemented very nicely as in Prismatic's Hiphip array manipulation library, which takes the long vs float specialization to the extreme with different namespaces)

Ghadi Shayban
added a comment - 03/Oct/14 10:22 AM I think penalizing existing code possibly 5x is a hard cost to stomach. Is there another approach where a protocolized range can live outside of core? CLJ-993 has a patch that makes it a reducible source in clojure.core.reducers, but it's coll-reduce not IReduce, and doesn't contain an Iterator. Otherwise we might have to take the chunked seq challenge.
Alex: Re long/float. Old reified Ranged.java in clojure.lang blindly assumes ints, it would be nice to have a long vs. float version, though I believe the contract of reduce boxes numbers. (Unboxed math can be implemented very nicely as in Prismatic's Hiphip array manipulation library, which takes the long vs float specialization to the extreme with different namespaces)

I don't think anyone is suggesting we push unboxed math all the way down through transducers. Instead, this patch contains a lot of calls to Numbers.*, if we were to assume that the start end and step params of range are all Longs, then we could remove all of these calls and only box when returning an Object (in .first) or when calling IFn.invoke (inside .reduce)

Timothy Baldridge
added a comment - 03/Oct/14 10:38 AM I don't think anyone is suggesting we push unboxed math all the way down through transducers. Instead, this patch contains a lot of calls to Numbers.*, if we were to assume that the start end and step params of range are all Longs, then we could remove all of these calls and only box when returning an Object (in .first) or when calling IFn.invoke (inside .reduce)

I agree that 5x slowdown is too much - I don't think we can give up chunked seqs if that's the penalty.

On the long case, I was suggesting what Tim is talking about, in the case of all longs, create a Range that stores long prims and does prim math, but still return boxed objects as necessary. I think the only case worth optimizing is all longs - the permutation of other options gets out of hand quickly.

Alex Miller
added a comment - 03/Oct/14 10:46 AM I agree that 5x slowdown is too much - I don't think we can give up chunked seqs if that's the penalty.
On the long case, I was suggesting what Tim is talking about, in the case of all longs, create a Range that stores long prims and does prim math, but still return boxed objects as necessary. I think the only case worth optimizing is all longs - the permutation of other options gets out of hand quickly.

Timothy Baldridge
added a comment - 31/Oct/14 11:30 AM Here's the latest work on this, a few tests fail. If someone wants to take a look at this patch feel free, otherwise I'll continue to work on it as I have time/energy.

Alex, while this is still a work in progress, I see that the change on ArrayChunk#reduce from previous WIP patches not only has not been reverted but has been extended. I don't think the current approach makes sense as ArrayChunk#reduce is not part of the IReduce/IReduceInit contract but of the IChunk contract and changing the behaviour to be IReduce-like in its handling of reduced introduces the burden of having to use preserve-reduced on the reducing function to no apparent benefit.

Given that the preserve-reduced is done on the clojure side, it seems to me like directly invoking .reduce rather than routing through internal-reduce should be broken but I haven't tested it.

Nicola Mometto
added a comment - 09/Dec/14 8:44 AM - edited Alex, while this is still a work in progress, I see that the change on ArrayChunk#reduce from previous WIP patches not only has not been reverted but has been extended. I don't think the current approach makes sense as ArrayChunk#reduce is not part of the IReduce/IReduceInit contract but of the IChunk contract and changing the behaviour to be IReduce-like in its handling of reduced introduces the burden of having to use preserve-reduced on the reducing function to no apparent benefit.
Given that the preserve-reduced is done on the clojure side, it seems to me like directly invoking .reduce rather than routing through internal-reduce should be broken but I haven't tested it.

Alex Miller
added a comment - 09/Dec/14 9:49 AM That's the work in progress part - I haven't looked at yet. I have not extended or done any work re ArrayChunk, just carried through what was on the prior patch. I'll be working on it again tomorrow.

quick review of clj-1515-2
1) withMeta gives the newly formed object the wrong meta.
2) LongRange/create() is the new 0-arity constructor for range, which sets the 'end' to Double/POSITIVE_INFINITY cast as a long. Current core uses Double/POSITIVE_INFINITY directly. Not sure how many programs rely upon iterating that far, or how they would break.
3) Relatedly, depending on the previous point: Because only all-long arguments receive chunking, the very common case of (range) with no args would be unchunked. Doesn't seem like too much of a stretch to add chunking to the other impl.
4) Though the commented invariants say that Range is never empty, the implementation uses a magic value of _count == 0 to mean not cached, which is surprising to me. hashcodes have the magic value of -1
5) s/instanceof Reduced/RT.isReduced
6) is the overflow behavior of "int count()" correct?

Ghadi Shayban
added a comment - 10/Dec/14 11:14 PM I am impressed and have learned a ton through this exercise.
quick review of clj-1515-2
1) withMeta gives the newly formed object the wrong meta.
2) LongRange/create() is the new 0-arity constructor for range, which sets the 'end' to Double/POSITIVE_INFINITY cast as a long. Current core uses Double/POSITIVE_INFINITY directly. Not sure how many programs rely upon iterating that far, or how they would break.
3) Relatedly, depending on the previous point: Because only all-long arguments receive chunking, the very common case of (range) with no args would be unchunked. Doesn't seem like too much of a stretch to add chunking to the other impl.
4) Though the commented invariants say that Range is never empty, the implementation uses a magic value of _count == 0 to mean not cached, which is surprising to me. hashcodes have the magic value of -1
5) s/instanceof Reduced/RT.isReduced
6) is the overflow behavior of "int count()" correct?

1) agreed!
2) Good point. I am definitely changing behavior on this (max of 9223372036854775807). I will look at whether this can be handled without affecting perf. Really, handling an infinite end point is not compatible with several things in LongRange.
3) I actually did implement chunking for the general Range and found it was slower (the original Clojure chunking is faster). LongRange is making up for that difference with improved primitive numerics.
4) Since empty is invalid, 0 and -1 are equally invalid. But I agree -1 conveys the intent better.
5) agreed
6) probably not. ties into 2/3.

Alex Miller
added a comment - 11/Dec/14 12:06 AM 1) agreed!
2) Good point. I am definitely changing behavior on this (max of 9223372036854775807). I will look at whether this can be handled without affecting perf. Really, handling an infinite end point is not compatible with several things in LongRange.
3) I actually did implement chunking for the general Range and found it was slower (the original Clojure chunking is faster). LongRange is making up for that difference with improved primitive numerics.
4) Since empty is invalid, 0 and -1 are equally invalid. But I agree -1 conveys the intent better.
5) agreed
6) probably not. ties into 2/3.
Thanks for this, will address.

Ghadi Shayban
added a comment - 29/Dec/14 12:29 PM Seems like a missing benchmark is (range 0 31) without the doall. Current patch (-9) allocates the chunk buffer at range's construction time. Maybe this can be delayed?

Ghadi - you are right. I reworked the patch (new -10 version) so that the chunk is only created on demand. Basically, the chunking is only used when traversing via chunked seqs and in normal seq iteration or reduce, no explicit chunks are created. That improved several timings. I also added the bench for (range 0 31) which was greatly improved by lazily creating the chunk, so good point on that.

Alex Miller
added a comment - 02/Jan/15 11:24 AM Ghadi - you are right. I reworked the patch (new -10 version) so that the chunk is only created on demand. Basically, the chunking is only used when traversing via chunked seqs and in normal seq iteration or reduce, no explicit chunks are created. That improved several timings. I also added the bench for (range 0 31) which was greatly improved by lazily creating the chunk, so good point on that.

I'm looking at the implementation of equiv() and wondering if it's worth checking whether the other object is also a reified range and comparing the private parameters rather than iterating through the sequences.

Michael Blume
added a comment - 03/Jan/15 1:38 PM I'm looking at the implementation of equiv() and wondering if it's worth checking whether the other object is also a reified range and comparing the private parameters rather than iterating through the sequences.

Attaching a simplified implementation as a deftype. I benchmarked a billion variants and dumped bytecode. I'm attaching the best performing combination, and it beats out the Java one in most cases.

Approach:
Two deftypes, one specialized to primitive longs, the other generic math.
Implement: Seqable/Counted,IReduce,IReduceInit,Iterable
Also, marker interfaces Sequential and Serialized.

The implementations are very straightforward, but Seqable deserves mention.
'seq' delegates its implementation to the existing lazy-seq based range, which has been stripped of the strange corner cases and now only handles strictly ascending/descending ranges. It has been renamed range*, rather than range1 because all the arguments to range are already numbers. core references to range have been accordingly renamed.

The new range constructor is loaded after reduce & protocols load. It checks types, and delegates to either long-range or generic-range, which handle the wacky argument cases that are not strictly ascending or descending. I'm annoyed with the structural duplication of those conditionals, not sure how to solve it.

Inside the LongRange implementation of IReduce/IReduceInit, boxing is carefully controlled, and was verified through bytecode dumping.

I elaborated the benchmarks for comparison, and also included benchmarking without type specialization.

You discover strange things working on this stuff. Turns out having the comparator close over the 'end' field is beneficial: #(< % end).

Alex, I figured out why the Java versions had a nearly exactly 2x regression on (doall (range 0 31)), and I attached a change to 'dorun'. Instead of calling (seq coll) then (next coll), it effectively calls (next (seq coll)). I think the implementation assumes that calling 'seq' is evaluating the thunk inside LazySeq. This should also help other Seqable impls like Cycle/Iterate/Repeat CLJ-1603.

Results (criterium full runs):

code

1.7.0-alpha5

clj-1515-10.patch (Java)

deftype specialized (attached)

deftype unspecialized

(count (filter odd? (take (* 1024 1024) (range))))

187.30 ms

194.92 ms

182.53 ms

184.13 ms

(transduce (take (* 1024 1024)) + (range))

58.27 ms

89.37 ms

84.69 ms

84.72 ms

(count (range (* 1024 1024)))

62.34 ms

27.11 ns

not run

not run

(reduce + (map inc (range (* 1024 1024))))

57.63 ms

46.14 ms

46.17 ms

52.94 ms

(reduce + (map inc (map inc (range (* 1024 1024)))))

82.83 ms

68.66 ms

64.36 ms

71.76 ms

(count (keep odd? (range (* 1024 1024))))

76.09 ms

57.59 ms

not run

not run

(transduce (comp (map inc) (map inc)) + (range (* 1024 1024)))

52.17 ms

39.71 ms

28.83 ms

42.23 ms

(reduce + 0 (range (* 2048 1024)))

85.93 ms

38.03 ms

26.49 ms

42.43 ms

(doall (range 0 31))

1.33 µs

2.89 µs

1.03 µs

1.08 µs

(doall (range 0 32))

1.35 µs

2.97 µs

1.07 µs

1.10 µs

(doall (range 0 128))

5.27 µs

11.93 µs

4.19 µs

4.29 µs

(doall (range 0 512))

21.66 µs

47.33 µs

16.95 µs

17.66 µs

(doall (range 0 4096))

171.30 µs

378.52 µs

135.45 µs

140.27 µs

(into [] (map inc (range 31)))

1.97 µs

1.57 µs

1.87 µs

1.95 µs

(into [] (map inc) (range 31))

1.66 µs

824.67 ns

891.90 ns

1.11 µs

(into [] (range 128))

5.11 µs

2.21 µs

2.43 µs

3.36 µs

(doall (range 1/2 1000 1/3))

1.53 ms

1.67 ms

1.59 ms

1.52 ms

(doall (range 0.5 1000 0.33))

164.83 µs

382.38 µs

149.38 µs

141.21 µs

(into [] (range 1/2 1000 1/3))

1.53 ms

1.40 ms

1.43 ms

1.50 ms

(into [] (range 0.5 1000 0.33))

157.44 µs

108.27 µs

104.18 µs

127.20 µs

Open Questions for screeners of the deftype patch:

1) What to do about the autopromotion at the Long/MAX_VALUE boundary? I've preserved the current behavior of Clojure 1.6.
2) Alex, I did not pull forward the filter chunk tweak you discovered
3) Is the structural duplication of the conditionals in generic-range long-range awful?
4) Are there any other missing interfaces? IMeta comes to mind. Not sure about IHashEq either.

Ghadi Shayban
added a comment - 18/Jan/15 7:26 PM Attaching a simplified implementation as a deftype. I benchmarked a billion variants and dumped bytecode. I'm attaching the best performing combination, and it beats out the Java one in most cases.
Approach:
Two deftypes, one specialized to primitive longs, the other generic math.
Implement: Seqable/Counted,IReduce,IReduceInit,Iterable
Also, marker interfaces Sequential and Serialized.
The implementations are very straightforward, but Seqable deserves mention.
'seq' delegates its implementation to the existing lazy-seq based range, which has been stripped of the strange corner cases and now only handles strictly ascending/descending ranges. It has been renamed range*, rather than range1 because all the arguments to range are already numbers. core references to range have been accordingly renamed.
The new range constructor is loaded after reduce & protocols load. It checks types, and delegates to either long-range or generic-range, which handle the wacky argument cases that are not strictly ascending or descending. I'm annoyed with the structural duplication of those conditionals, not sure how to solve it.
Inside the LongRange implementation of IReduce/IReduceInit, boxing is carefully controlled, and was verified through bytecode dumping.
I elaborated the benchmarks for comparison, and also included benchmarking without type specialization.
You discover strange things working on this stuff. Turns out having the comparator close over the 'end' field is beneficial: #(< % end).
Alex, I figured out why the Java versions had a nearly exactly 2x regression on (doall (range 0 31)), and I attached a change to 'dorun'. Instead of calling (seq coll) then (next coll), it effectively calls (next (seq coll)). I think the implementation assumes that calling 'seq' is evaluating the thunk inside LazySeq. This should also help other Seqable impls like Cycle/Iterate/Repeat CLJ-1603.
Results (criterium full runs):

code

1.7.0-alpha5

clj-1515-10.patch (Java)

deftype specialized (attached)

deftype unspecialized

(count (filter odd? (take (* 1024 1024) (range))))

187.30 ms

194.92 ms

182.53 ms

184.13 ms

(transduce (take (* 1024 1024)) + (range))

58.27 ms

89.37 ms

84.69 ms

84.72 ms

(count (range (* 1024 1024)))

62.34 ms

27.11 ns

not run

not run

(reduce + (map inc (range (* 1024 1024))))

57.63 ms

46.14 ms

46.17 ms

52.94 ms

(reduce + (map inc (map inc (range (* 1024 1024)))))

82.83 ms

68.66 ms

64.36 ms

71.76 ms

(count (keep odd? (range (* 1024 1024))))

76.09 ms

57.59 ms

not run

not run

(transduce (comp (map inc) (map inc)) + (range (* 1024 1024)))

52.17 ms

39.71 ms

28.83 ms

42.23 ms

(reduce + 0 (range (* 2048 1024)))

85.93 ms

38.03 ms

26.49 ms

42.43 ms

(doall (range 0 31))

1.33 µs

2.89 µs

1.03 µs

1.08 µs

(doall (range 0 32))

1.35 µs

2.97 µs

1.07 µs

1.10 µs

(doall (range 0 128))

5.27 µs

11.93 µs

4.19 µs

4.29 µs

(doall (range 0 512))

21.66 µs

47.33 µs

16.95 µs

17.66 µs

(doall (range 0 4096))

171.30 µs

378.52 µs

135.45 µs

140.27 µs

(into [] (map inc (range 31)))

1.97 µs

1.57 µs

1.87 µs

1.95 µs

(into [] (map inc) (range 31))

1.66 µs

824.67 ns

891.90 ns

1.11 µs

(into [] (range 128))

5.11 µs

2.21 µs

2.43 µs

3.36 µs

(doall (range 1/2 1000 1/3))

1.53 ms

1.67 ms

1.59 ms

1.52 ms

(doall (range 0.5 1000 0.33))

164.83 µs

382.38 µs

149.38 µs

141.21 µs

(into [] (range 1/2 1000 1/3))

1.53 ms

1.40 ms

1.43 ms

1.50 ms

(into [] (range 0.5 1000 0.33))

157.44 µs

108.27 µs

104.18 µs

127.20 µs

Open Questions for screeners of the deftype patch:
1) What to do about the autopromotion at the Long/MAX_VALUE boundary? I've preserved the current behavior of Clojure 1.6.
2) Alex, I did not pull forward the filter chunk tweak you discovered
3) Is the structural duplication of the conditionals in generic-range long-range awful?
4) Are there any other missing interfaces? IMeta comes to mind. Not sure about IHashEq either.

note with the deftype patch, the transduce over infinite (range) case is the only one where 'master' currently performs best. This is because the implementation was changed to (iterate inc' 0). When CLJ-1603 is applied that situation should improve better.

Ghadi Shayban
added a comment - 18/Jan/15 7:30 PM note with the deftype patch, the transduce over infinite (range) case is the only one where 'master' currently performs best. This is because the implementation was changed to (iterate inc' 0). When CLJ-1603 is applied that situation should improve better.

new patch CLJ-1515-deftype-nostructural-dup.patch with the silly conditional duplication removed. Updated benchmarks including 'count' impls. These benchmarks run on a different machine with the same hardness. Blue ribbon.

Ghadi Shayban
added a comment - 18/Jan/15 11:05 PM - edited new patch CLJ-1515-deftype-nostructural-dup.patch with the silly conditional duplication removed. Updated benchmarks including 'count' impls. These benchmarks run on a different machine with the same hardness. Blue ribbon.

This is looking very good and I think we should move forward with it as the preferred approach. Feel free to update the description appropriately. I'll file a separate ticket with the filter tweak. Some comments on the patch:

1) GenericRange/count - this math is broken if you start to mix infinity in there. I think just (count (seq this)) is safer.
2) GenericRange/iterable - I think if we are IReduceInit and Seqable, we can omit this. I had it in the Java one because I inherited Iterable via ASeq but that's not an issue in this impl.
3) LongRange/reduce - why Long/valueOf? Isn't start a long field?
4) LongRange/iterable - ditto #2
5) print-method - anytime I see print-method, there should probably be print-dup too. For example, this is broken: (binding [*print-dup* true] (println (range 0 10)))
6) Is serialization broken by this patch? Can you justify the test changes?

Alex Miller
added a comment - 19/Jan/15 8:32 AM This is looking very good and I think we should move forward with it as the preferred approach. Feel free to update the description appropriately. I'll file a separate ticket with the filter tweak. Some comments on the patch:
1) GenericRange/count - this math is broken if you start to mix infinity in there. I think just (count (seq this)) is safer.
2) GenericRange/iterable - I think if we are IReduceInit and Seqable, we can omit this. I had it in the Java one because I inherited Iterable via ASeq but that's not an issue in this impl.
3) LongRange/reduce - why Long/valueOf? Isn't start a long field?
4) LongRange/iterable - ditto #2
5) print-method - anytime I see print-method, there should probably be print-dup too. For example, this is broken: (binding [*print-dup* true] (println (range 0 10)))
6) Is serialization broken by this patch? Can you justify the test changes?

These new types are merely seqable, so (not= (LongRange. 0 10 1) (seq (LongRange. 0 10 1)))
Not sure how to handle this 100%. Nothing precludes the LongRange itself from roundtripping, but just that calling seq on it returns a separate object.

1) Instead of extending IReduce and IReduceInit, just extend IReduce and implement both arities (IReduce extends IReduceInit).
2) I'm slightly troubled by the .invokePrim now. Did you look at a macro that does prim type-hinting maybe?
3) On the serialization thing, is the problem really with serialization or with equality? The test that's failing is (= v rt). Because these are not IPersistentCollections, pcequiv won't be used and it's just calling LongRange.equals(), which is not implemented and falls back to identity, right? Probably need to implement the hashCode and equals stuff. Might need IHashEq too.

Alex Miller
added a comment - 23/Jan/15 9:57 AM More comments...
1) Instead of extending IReduce and IReduceInit, just extend IReduce and implement both arities (IReduce extends IReduceInit).
2) I'm slightly troubled by the .invokePrim now. Did you look at a macro that does prim type-hinting maybe?
3) On the serialization thing, is the problem really with serialization or with equality? The test that's failing is (= v rt). Because these are not IPersistentCollections, pcequiv won't be used and it's just calling LongRange.equals(), which is not implemented and falls back to identity, right? Probably need to implement the hashCode and equals stuff. Might need IHashEq too.

Cause: Might be irrelevant in the face of CLJ-1375. Util/equiv for maps doesn't use IPC/equiv unless the collection is also java.util.Collection or Map. If it is then it checks for IPersistentCollection. I added a check for IPersistentCollection first.

Ghadi Shayban
added a comment - 23/Jan/15 12:13 PM Took care of 1) and 3). Punting on hiding the invokePrim behind a macro. It may be shameful but it works really fast.
Another case found:
(assoc {'(0 1 2 3 4) :foo} (range 5) :bar) needs to properly overwrite keys in the map.
Cause: Might be irrelevant in the face of CLJ-1375. Util/equiv for maps doesn't use IPC/equiv unless the collection is also java.util.Collection or Map. If it is then it checks for IPersistentCollection. I added a check for IPersistentCollection first.
https://github.com/ghadishayban/clojure/commits/for-screening
To get hasheq working, I added back the iterators for use by Murmur3/hashOrdered.

Will add pronto but maybe someone can clarify something: Adding a simple clojure.lang.IObj/withMeta to the deftype results in an AbstractMethodError when trying to print a range. This is because the impl of vary-meta used in the default printer assumes that if all IObjs are also IMetas. Seems like a problem with either vary-meta's impl or the interface separation.

I can fix by:
1) adding a _meta field to Ranges and handling IMeta as well as IObj.

Ghadi Shayban
added a comment - 20/Feb/15 12:09 PM Will add pronto but maybe someone can clarify something: Adding a simple clojure.lang.IObj/withMeta to the deftype results in an AbstractMethodError when trying to print a range. This is because the impl of vary-meta used in the default printer assumes that if all IObjs are also IMetas. Seems like a problem with either vary-meta's impl or the interface separation.
I can fix by:
1) adding a _meta field to Ranges and handling IMeta as well as IObj.

Fogus
added a comment - 27/Mar/15 1:57 PM Screened. Though the amount of duplicated code saddens me, I don't hold it against the implementer. It's a fairly straight-forward counting Impl with a chunk cache.