Nicola Mometto
added a comment - 14/Feb/14 5:46 PM I want to point out that my patch breaks ABI compatibility.
A possible approach to avoid this would be to have 3 constructors instead of 2, I can write the patch to support this if desired.

The patch 0001-CLJ-1224-cache-hasheq-and-hashCode-for-records.patch is broken in at least two ways:

The fields __hash and __hasheq are adopted by new records created by .assoc and .without, which will cause those records to have incorrect (and likely colliding) hash values

The addition of the new fields breaks the promise of defrecord, which includes an N+2 constructor taking meta and extmap. With the patch, defrecords get an N+4 constructor letting callers pick hash codes.

Stuart Halloway
added a comment - 27/Jun/14 11:09 AM The patch 0001-CLJ-1224-cache-hasheq-and-hashCode-for-records.patch is broken in at least two ways:

The fields __hash and __hasheq are adopted by new records created by .assoc and .without, which will cause those records to have incorrect (and likely colliding) hash values

The addition of the new fields breaks the promise of defrecord, which includes an N+2 constructor taking meta and extmap. With the patch, defrecords get an N+4 constructor letting callers pick hash codes.

Andy Fingerhut
added a comment - 29/Aug/14 4:32 PM All patches dated Jun 7 2014 and earlier no longer applied cleanly to latest master after some commits were made to Clojure on Aug 29, 2014. They did apply cleanly before that day.
I have not checked how easy or difficult it might be to update this patch.

1) hash and hasheq are stored as Objects, which seems kind of gross. It would be much better to store them as primitive longs and check whether they'd been calculated by comparing to 0. We still end up with a long -> int conversion but at least we'd avoid boxing.

2) assoc wrongly copies over the __hash and __hasheq to the new instance:

Nicola, I'm happy to let you continue to do dev on this patch with me doing the screening if you have time. Or if you don't have time, let me know and I (or someone else) can work on the dev parts. Would like to get this prepped and clean for 1.8.

Alex Miller
added a comment - 16/Jun/15 4:06 PM 1) hash and hasheq are stored as Objects, which seems kind of gross. It would be much better to store them as primitive longs and check whether they'd been calculated by comparing to 0. We still end up with a long -> int conversion but at least we'd avoid boxing.
2) assoc wrongly copies over the __hash and __hasheq to the new instance:

4) Needs some perf tests with a handful of example records (at least: 1 field, many fields, extmap, etc).
Nicola, I'm happy to let you continue to do dev on this patch with me doing the screening if you have time. Or if you don't have time, let me know and I (or someone else) can work on the dev parts. Would like to get this prepped and clean for 1.8.

Nicola Mometto
added a comment - 16/Jun/15 5:59 PM - edited Alex, regarding point 1, I stored __hash and __hasheq as ints rather than longs and compared to -1 rather than 0, to be consistent with how it's done elsewhere in the clojure impl

1) the -1 there is a long and requires an upcast of the private field from int to long. I'm sure that's not a big deal, but wish there was a way to avoid it. I didn't try it but maybe (int -1) would help the compiler out?

Alex Miller
added a comment - 17/Jun/15 11:39 AM Looking at the bytecode for hashcode and hasheq, I had two questions:
1) the -1 there is a long and requires an upcast of the private field from int to long. I'm sure that's not a big deal, but wish there was a way to avoid it. I didn't try it but maybe (int -1) would help the compiler out?
2) I wonder whether something like this would perform better:

1- there's no Numbers.equiv(int, int) so even casting -1 to an int wouldn't solve this. a cast is always necessary. if we were to make hasheq a long, we'd need l2i in the return path, making hasheq an int we need an i2l in the comparison.
2- that doesn't remove any casting, it just replaces a load from the local variable stack with a field load:

Nicola Mometto
added a comment - 17/Jun/15 11:54 AM 1- there's no Numbers.equiv(int, int) so even casting -1 to an int wouldn't solve this. a cast is always necessary. if we were to make hasheq a long, we'd need l2i in the return path, making hasheq an int we need an i2l in the comparison.
2- that doesn't remove any casting, it just replaces a load from the local variable stack with a field load:

Note that using -1 for the uncomputed hash value can cause issues with transient lazily computed hash codes on serialization (CLJ-1766). In this case, the defrecord cached code is not transient so I don't think it's a problem, but something to be aware of. Using 0 would avoid this potential issue.

Alex Miller
added a comment - 22/Jun/15 8:38 AM Note that using -1 for the uncomputed hash value can cause issues with transient lazily computed hash codes on serialization (CLJ-1766). In this case, the defrecord cached code is not transient so I don't think it's a problem, but something to be aware of. Using 0 would avoid this potential issue.

Nicola Mometto
added a comment - 17/Jul/15 7:50 PM Updated patch so that it applies on current master and changed the default hash value from -1 to 0.
Rich, we still need initialization since all the record ctors delegate to the ctor arity with explicit __hash and __hasheq, following the approach of the alt ctors for __extmap and __meta

Alex Miller
added a comment - 28/Jul/15 6:17 PM Hey Nicola, two comments on the hasheq/hashcode impl:
1) I don't think there's any reason to use == in the check instead of =, and = seems better (I think the resulting bytecode is same either way though).
2) The generated bytecode in these cases will call getfield twice in the cached case (once for the check, and once for the return):