clj-1270-1.txt is just a quick swipe at changing the printing behavior for Java objects whose classes implement the java.util.Queue interface, so that pr/pr-str shows them similarly to how it shows Clojure lists and seqs. java.util.Set and java.util.List objects already pr/pr-str similarly to Clojure lists, vectors, or sets.

No change is made in this patch to the behavior of print/print-str, or to how other objects are printed whose classes implement java.util.Collection, but not also one of java.util.Set, .List, or .Queue.

A few tests are added to verify the desired behavior when *print-length* is set to a value other than nil.

Problem: I was trying to compile a project that has some native dependencies (using clojure-maven-plugin version 1.3.13 with Maven 2.0). I forgot to set java.library.path properly so that the native library could be found, and only got an error of

Cause: Compile.java flushes and closes RT.out in the finally block. Once out is closed, it is impossible to write out the stack trace for the UnsatisfiedLinkError that was being thrown. This made it very difficult to debug what was happening.

Solution: Flush, but do not close, RT.out in Compile/main. A test is included, but may or may not be worth keeping.

Updated description a bit and added a patch. The code change is just to stop closing RT.out in Compile.main. The test to invoke Compile and check the stream has an annoying amount of setup and teardown so it's a bit messy. Would love to have more io facilities available (with managed cleanup)!

An issue I brought up in the email thread is consistency: vector binding works with anything nthable, map binding works with anything associative. With my current patch (empty-list-destructuring-CLJ-1140-12.30.12.diff), only ISeqs are supported for kwarg map binding.

I'd prefer it work with anything seqable, and can provide a patch that does that. I would go ahead and do so, but since this ticket is now Approval: OK, I didn't want to alter what had been OK'ed. Let me know if you want a patch that adds support for anything seqable.
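
To make the distinction concrete, a sketch (the seq behavior shown is what the current patch provides; the vector line is only a hypothetical under the broader "anything seqable" patch):

```clojure
;; Vector binding works with anything nthable, map binding with
;; anything associative. Kwarg-style map binding consumes a seq of
;; alternating keys and values:
(let [{:keys [a b]} (list :a 1 :b 2)]
  [a b])
;;=> [1 2]

;; "Anything seqable" would additionally cover e.g. a vector of kwargs,
;; which vectors' index-based associative destructuring does not give you:
;; (let [{:keys [a b]} [:a 1 :b 2]] [a b])  ; only with the broader patch
```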

The 'ns macro is not as dynamic as it could be.
For instance, typing (ns a)(ns b (:require a)) at a REPL currently (1.4, 1.3, etc.) fails with an exception, because the (:require a) clause tries to reach the filesystem for the file a.clj or a__init.class.
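
Concretely (the failing form is wrapped in eval/try here so it can be delimited; the exact message varies by version):

```clojure
;; A fresh REPL; nothing named a.clj or a__init.class on the classpath:
(ns a)          ; creates and switches to namespace a -- succeeds

;; But requiring it still goes to the filesystem and fails:
(try
  (eval '(ns b (:require a)))
  (catch java.io.FileNotFoundException e
    (.getMessage e)))
;; => "Could not locate a__init.class or a.clj on classpath ..."
```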

The attached patch ( dynamic-ns-patch2.diff ) allows a successful call to (ns a) to behave the same as a successful call to (require 'a), adding namespace a to the list of loaded-libs.

On further thought and comparison to APersistentMap, I'm not sure if that's the best use of Util.hasheq(). I can't find good reference on the semantic differences and am not familiar enough with the Clojure source to infer it, yet.

The Set equality code currently in Clojure uses .hashCode to quickly bail out of comparisons in both .equals and .equiv. This feels wrong, since .equals and .equiv are paired with different hash functions (.hashCode and hasheq, respectively). The map equality code avoids the problem by not checking any hash.

Aaron's 2/13 patch makes set equality work like map equality, not checking the hash. (I think a much shorter patch that accomplishes the same thing merely drops the call to hash.)

It seems to me a separate problem that both hash and set are broken for Java .equals rules, because of the equiv-based handling of their keys.

Currently, reduce-kv and fold throw when used on nil, because their respective protocols don't extend to nil. This seems strange, since Clojure tends to handle nils gracefully where possible, especially in places where collections are expected.
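
A minimal illustration of the inconsistency (the reduce-kv outcome shown is the graceful behavior argued for here, not the behavior at the time of this ticket):

```clojure
;; reduce already treats nil as an empty collection:
(reduce + 0 nil)  ;=> 0

;; reduce-kv on nil threw at the time of this ticket, because IKVReduce
;; did not extend to nil; the graceful analogue is to return the init value:
(reduce-kv (fn [m k v] (assoc m v k)) {} nil)
;; desired => {}
```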

I created the patch for CLJ-870 that added the function re-quote-replacement before Clojure 1.4 was released, and I was apparently hoping it would get into that release. It is in now, but the :added metadata incorrectly states that re-quote-replacement was added in 1.4.

And before creating this patch, I did a diff between Clojure 1.4 and 1.5-beta1 source code, and verified that every other function with :added metadata that has been added since 1.4 says "1.5". Only re-quote-replacement was mislabeled in this way.

changes-draft-v1.md is not a patch, but an outline of a proposed new changes.md file for Clojure 1.5 with some of the content fleshed out. It still has lots of occurrences of "TBD" for "To Be Documented" in it.

It does mention every ticket that had a patch committed between the Clojure 1.4.0 release and Nov 15 2012.

I am hoping someone who is more knowledgeable of the "TBD" parts takes this and runs with it.

I'll put up an updated version soon, but that headline wasn't properly updated to match the later occurrence of it in the body, which is: "Clojure 1.5 reducers library requires Java 6 or later". Is it true that the ForkJoin library requires Java 6 or later? If not, how can it be made to work with Java 5?

changes-draft-v8.md, section 2.4, needs an update for description of some->. The document says "logical true" where it should say "not nil". Same comment applies to some->>. The point is that false will thread through the forms. This is a change from the replaced when-> (in 1.5-beta1).
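
The point about false is easy to demonstrate:

```clojure
;; false threads through; only nil short-circuits:
(some-> false not)  ;=> true
(some-> nil not)    ;=> nil
;; under "logical true" semantics, false would have stopped the chain
```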

I find the following sentence a little vague: "Clojure 1.5 can still be compiled and run with Java 5, but the test suite will not pass due to the lack of support for ForkJoin."

The problem is the application of the "but" to both parts of the conjunction. Isn't the intent actually: "Clojure can run with Java version 1.5 or later. Running the Clojure build requires 1.6, in order to test features that work with ForkJoin."

Stuart H, that is almost the intent, and probably close enough for this kind of documentation.

The full gory details are as follows:

Clojure can build and run with Java version 1.5 or later. A Java version 1.5 build of Clojure only works if you do not run the test suite, e.g. by using the command "ant jar". To build Clojure including running the test suite, or to use the new reducers library, requires Java 1.6 or later.

If the patch for CLJ-1109 is applied, the full gory details become simpler to state:

Clojure can build and run with Java version 1.5 or later. Using the new reducers library requires Java 1.6 or later.

A number of vars from clojure.repl, clojure.java.javadoc, and clojure.pprint are unconditionally referred into *ns* by clojure.main/repl. This is fine when it is being used e.g. as the primary driver of a terminal-bound Clojure REPL, but other usages can end up bringing those utility vars into namespaces other than 'user. This can cause problems if clojure.main/repl is used to drive a REPL within namespaces that already have referred or interned vars with the same names as those utility vars, e.g.:
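
A hedged sketch of the kind of collision meant here (the namespace and function are hypothetical, and the error message is approximate):

```clojure
;; Suppose a namespace interns its own var named `doc`:
(ns my.app)                          ; hypothetical namespace
(defn doc [x] (str "docs for " x))

;; Driving a REPL in my.app via clojure.main/repl refers clojure.repl/doc
;; (among others) into my.app, which collides with the interned var:
;; IllegalStateException: doc already refers to: #'my.app/doc
;;   in namespace: my.app
```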

Worse, nREPL uses clojure.main/repl (in large part to maximize the consistency of REPL behaviour across different Clojure versions), where each user expression is evaluated through a separate clojure.main/repl invocation. This leads to the same problems as above, but for every nREPL user, session, and expression (reported @ NREPL-31).

@Kevin: It's not required, but I found it far more straightforward to not try to pretend that the underlying transport was stream-based when it's actually message-based. It also means that sessions can be very lightweight: unless code is being evaluated within a session, it is not occupying a thread, and takes up only as much space as its map of thread-local bindings.

It really is a pity the implementations of hashCode and hasheq are duplicated. I wonder how important it is to extend Obj? Regardless, that's the approach PersistentQueue was already taking, so changing that would be outside the scope of this ticket anyway.

I can't apply this patch with "git am", but "patch -p 1" works fine. I'm hoping this is a configuration problem on my end, so I'm marking this screened.

That's really a nasty problem. I wrote the code that makes it possible to compile clojure with either JDK 7 and its native ForkJoin or an older JDK plus jsr166y. Unfortunately, if you build clojure with JDK 7, reducers' fold requires JDK 7 at runtime, and if you build clojure with a pre-7 JDK plus jsr166y, fold likewise requires jsr166y at runtime.

The essential problem is that clojure is AOT-compiled to class files, and those are put into the release jars. Ordinary clojure libraries are distributed as jar files containing clj files that are compiled when loaded; there, the implemented approach works just fine. For example, taking your example again with 1.5.0-alpha4:

This patch removes the clojure.core.reducers namespace from the namespaces to be AOT compiled. So now this namespace will be JIT-compiled when being required, and at that point either the JDK7 ForkJoin classes or the jsr166y classes need to be there.
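
With reducers out of the AOT list, the namespace compiles on first require against whatever ForkJoin implementation is on the classpath. A quick smoke test of the sort that should then work on either Java setup:

```clojure
;; compiled at require time, not shipped as pre-built classes:
(require '[clojure.core.reducers :as r])
(r/fold + (vec (range 100)))  ;=> 4950
```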

I think that clj-1065-set-map-constructors-allow-duplicates-v1.txt dated Sep 7 2012 updates Clojure for the behavior Rich recommends on the dev Wiki page in the description. I may have missed something, though. Perhaps the most questionable part is the way I've chosen to implement array-map always taking the last key's value. It is no less efficient than the PersistentArrayMap createWithCheck method, i.e. O(n^2).

array-map is ok. I would prefer that the doc strings say "as if by repeated assoc" (or conj for sets). "eliminates" and "last" are less precise, e.g. what if you pass equal things, but with different metadata, to hash-set? "Eliminates dupes" doesn't say.

Also, his mention of metadata and "as if by assoc" led me to beef up the new test cases to check that metadata is correct, and I thus found that my new array-map constructor was not doing the right thing. This one does.
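
For reference, a sketch of the "as if by repeated assoc" semantics, including the metadata subtlety (symbols are used for the metadata example, since keywords cannot carry metadata):

```clojure
;; assoc semantics: the last value for a duplicated key wins...
(array-map :a 1 :a 2)   ;=> {:a 2}

;; ...but repeated assoc keeps the first (equal) key object, which
;; matters when equal keys carry different metadata:
(-> (assoc {} (with-meta 'k {:first true}) 1)
    (assoc (with-meta 'k {:second true}) 2)
    keys first meta)
;;=> {:first true}
```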

There is still no implementation of Rich's #3 yet. Just wanted to get this one up there in case someone else builds on it before I do.

Patch clj-1065-do-map-constant-duplicate-key-checks-compile-time-only-v1.txt only makes changes that should eliminate run-time checks for duplicate map keys, if they are compile-time constants.

It is currently separate from the changes to those for the set/map constructor functions, since I am less sure about these changes. I'm not terribly familiar with the compiler internals, and I might be going about this the wrong way. Better to get separate feedback on these changes alone. I'm happy to merge them all into one patch if both parts look good.

The when-first macro will evaluate the xs expression twice. Admittedly, it does exactly what the doc string says, but that seems undesirable to me. Even without side effects, there's a potential performance issue if xs is some expensive operation.
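
The double evaluation is visible in the (pre-fix) expansion, which was roughly:

```clojure
;; (when-first [x xs] body) used to expand along the lines of:
;;   (when (seq xs)
;;     (let [x (first xs)] body))
;; so the xs expression appears, and is evaluated, twice:
(def evals (atom 0))
(when-first [x (do (swap! evals inc) [1 2 3])]
  x)
@evals  ; 2 before the fix, 1 after
```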

Patch clj-1052-assoc-should-throw-exc-if-missing-val-patch1.txt dated Aug 29 2012 makes assoc throw an IllegalArgumentException if more than one key is given, but the last key has no value. It includes a few simple test cases with correct and incorrect arguments to assoc.
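
Illustrating the checked behavior:

```clojure
(assoc {} :a 1 :b 2)  ;=> {:a 1, :b 2}

;; dangling key with no value:
(assoc {} :a 1 :b)
;; with the patch: IllegalArgumentException
```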

The patch appears correct. It does introduce a single extra (next) per iteration into the success path, although that seems unlikely to dominate the work. Wouldn't hurt to add an assessment showing this is no slower for correct programs.

(average of 3 times after patch) / (average of 3 times before patch) = 1.0240

So a 2.4% slowdown on average for that test case. I should add that I'm not a statistician, but note that this 2.4% difference is less than the run-to-run variation of the same experiment. Likely any real statistician would recommend collecting a lot more data before asserting there is a change in performance.

clj-1052-assoc-should-throw-exc-if-missing-val-patch3.txt dated Oct 5 2012 is the same as the previous patch clj-1052-assoc-should-throw-exc-if-missing-val-patch2.txt dated Aug 29 2012 except some context lines have been updated so that it applies cleanly using git. The older version will be removed in a minute.

Note that this patch also removes the cyclic load tests. Those tests relied on leaving broken code on the test classpath, which is very hostile to tooling (in this case the test runner's ability to walk the code in the test directory.)

I believe such tests are an antipattern, and should be implemented differently or e.g. placed in an ancillary test suite.

I agree with your position on the cyclic load tests. The patch looks sane to me and ran without a hitch. A couple of points of note:

This requires one to re-run antsetup – not a problem, but it might trip up a few people

The final output of the test run is different from before. In the case of a successful run, we now see only the generative success stats. Are the regular clojure.test stats included? It's not clear to me, but I don't think they are. In the failure case we see something like the following. I actually like the map output better, but one advantage of the stack trace output was that it sometimes helped identify code bugs. In any case, it would be nice to see an aggregate score at the end. This is a test.generative feature request, I know.

The stats reports are aggregate, i.e. they roll up both the generative and clojure.test stats. It would be straightforward to separate these in the report. I will look at doing that in the next drop of c.t.g.

The output already includes stack traces for errors, but not for test failures. I thought this was consistent with clojure.test, but maybe I missed something.

The current implementation dereferences Reduced values and returns them up the chain; but that means that parent nodes don't know the reduction has completed, so the reduction might continue along other paths down the tree.
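
The contract being violated is the one plain reduce honors: a Reduced value terminates the entire reduction, not just the branch that produced it.

```clojure
;; reduced must stop the whole reduction:
(reduce (fn [acc x]
          (if (>= x 5) (reduced acc) (conj acc x)))
        [] (range 10))
;;=> [0 1 2 3 4]
```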

This patch touches the same areas of code as CLJ-1040 so will have conflicts if both patches are merged separately. However, as noted in my comment on CLJ-1040, I don't think the patch there is correct yet, so I've also included a commit (7beb14256) to fix the issue in CLJ-1040. Thus, this patch can be merged independently of 1040's patch, solving both issues without a merge conflict.

This patch also includes tests for both issues, which fail before my commits are applied, and pass afterwards.

Because a new (fill) action is created every time an item is (drain)ed, the LBQ gets loaded up with piles of EOS markers. If the output seq is consumed faster than the input sequence can produce items, then the original agent action does all of the filling in a single action, while the agent's queue continues to back up with (fill) actions. Eventually the entire sequence is consumed, and each queued action does a blocking .put of an EOS marker; once the queue has filled up, one of these actions blocks forever, since nobody will ever .take from the queue again.

The attached patch does two things:

1. Make sure to put exactly one EOS marker in the stream, by setting the agent's state to nil if and only if a .offer of EOS has been accepted.

2. Replace all occurrences of .put with .offer (and appropriate checking of the return value). This is necessary because if .put ever blocks, it's possible for the system to remain in that state forever (say, because the client stops consuming the queue), thus leading to the same leaked-thread scenario.
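
The relevant LinkedBlockingQueue behavior is that .offer fails fast where .put would block:

```clojure
(import '[java.util.concurrent LinkedBlockingQueue])

(let [q (LinkedBlockingQueue. 1)]   ; capacity 1
  [(.offer q :a)                    ; true  -- accepted
   (.offer q :b)])                  ; false -- queue full; .put would block here
;;=> [true false]
```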

Sorry for triggering on the function "seque" without a deep understanding, but is this by any chance the same issue as described in CLJ-823? If so, since this one has a patch and that one doesn't, it might be nice to mark CLJ-823 as a duplicate of this one.

No, these are separate issues. CLJ-823 is just a special case of the general problem that seque is not usable from within an agent action. I have an implementation of seque using futures instead of agents so that this isn't a problem, but that has other problems of its own, specifically if you don't fully consume the seque you wind up leaking a future object.

First of all, clojure.lang.IFn is not a protocol but an interface, and it does not declare a

public Object invoke(Object... obs)

method. It has an `invoke` method with 20 Object parameters followed by an Object... parameter, but to give an implementation for that, you have to specify every parameter separately, and the last Object... arg is just a normal argument that must be an Object[]. That's because Java varargs (Type... parameters) are just syntactic sugar; in the bytecode it is simply a Type array.

What your example does is provide an `invoke` implementation for the 2-args version, where the first parameter happens to be named `&`, which has no special meaning here. Take that example:

> The fact that the meaning of & gets 'subverted' makes the issue only (slightly) worse, in my opinion.

I agree. I'll attach a patch which checks for those invalid usages soon.

> Just for the record, destructuring seems to work, at least for interface impls.

Could you please provide a complete example demonstrating your statement?

I'm rather sure that varargs and destructuring don't work for any of defprotocol, definterface, deftype, defrecord, and reify. But you can use varargs and destructuring when providing dynamic implementations via `extend` (or `extend-protocol`, `extend-type`), because those impls are real functions in contrast to methods.

compiles just fine, and & is interpreted as a usual argument that happens to be
named & without special meaning. But clearly, the user wanted to specify a
varargs parameter here. The same applies to definterface.
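
By contrast, the extend route uses plain fns, so destructuring works there. A hedged sketch (protocol and names are made up for illustration):

```clojure
(defprotocol P
  (f [this x]))

;; Inline impls (deftype/defrecord/reify) compile to JVM methods, so no
;; varargs or destructuring. Via extend-protocol the impls are ordinary
;; Clojure fns, so destructuring is available:
(extend-protocol P
  String
  (f [this {:keys [a]}]    ; map destructuring in an extend impl
    [this a]))

(f "s" {:a 1})  ;=> ["s" 1]
```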


Tassilo, with your patch 0001-CLJ-1024-Check-for-invalid-varags-destrucuring-uses.patch dated July 12, 2012, I get the following error message while testing, apparently because some metadata is missing on the new functions your patch adds to core:

This looks ok to me, but it seems like a fair amount of duplication to accomplish the task. It seems like we should just be able to ask if it is ok to proceed, instead of having to pick which function needs to be called in what case.

clj-1019-ns-resolve-doc-string-misspelling-patch1.txt dated July 12, 2012 is identical to fix_ns-resolve.patch of June 30, 2012, except it is in git format. It includes Gabriel's name and email address for proper attribution.

Could you explain how you think the docstring definition implies this behavior? Maybe I'm missing something, but I was surprised. For instance, in the case of (range 3 9 0), if start were truly inclusive as the docstring says, the result would be (3), not ().

You mentioned that you think the infinite sequence of 0's is consistent and in keeping with the definition of range. I'm not sure I agree. (0,0] is an empty set of length zero, and [0,0) is an empty set of length zero.

You state that an empty list only makes sense for start >= end, but this is not the current behavior. Could you help me understand what you believe the appropriate behavior would be in each of the following three cases: (range 0 10 0), (range 10 0 0), and (range 0 0 0)?

A few options to consider:
1) Fix the docstring to reflect the actual behavior of range.
2) Handle the case of (range 9 3 0) => (9 9 9 ...) to make it consistent with the behavior of (range 3 9 0) => ()
3) Stop allowing a step of zero altogether.

It does make sense NOT to allow a step of zero (at least to me). I wasn't going to say anything about this, but if other popular languages do not allow 0, then I guess it makes more sense than I originally gave it credit for. However, if we want to be mathematically correct, then the answer would be to return an infinite seq of the start value, like below. After all, what is a step of 0? Does it make any difference how many steps you take if with every step you cover 0 distance? Obviously not: you start from x and you stay at x forever. We do have repeat for this sort of thing, though.

It would be inconsistent to make 0 a special case; all of these are valid arithmetic progressions. And all of these are consistent with the current docstring.

If you make 0 a special case, then people will need to write special case code to handle it. Consider the code to create a multiplication table for example:

(for [x (range 10)]
  (take 10 (range 0 Long/MAX_VALUE x)))

This works fine if you produce an infinite sequence of zeros for step 0, but fails if you create an empty list as a special case for step 0.

As a related issue, I'd personally also favour negative step sizes producing an infinite sequence. If we don't want to allow this, though, then at least the docstring should be changed to say "a lazy seq of non-decreasing nums" and a negative step should throw an error.

Is this behavior intentional for some historical reason I don't know about or understand?

I doubt it.

Has this been brought up before? I couldn't find any reference to it via Google.

Not that I know of.

Are there performance implications to my patch?

I doubt it. Lazy seqs are not ideal for high-performance code anyway.

Am I addressing a symptom rather than the problem?

I think the problem is that the result of `range` with a step of 0 was never specified. Don't assume that the tests are specifications. Many of the tests in Clojure were written by over-eager volunteers who defined the tests based on whatever the behavior happened to be. The only specification from the language designer is the docstring. By my reading, that means that `range` with a step of 0 should return an infinite seq of the start value, unless the start and end values are equal.

As someone who has to read these tickets, I'd really appreciate it if you could keep the description up to date and accurate, with examples of the current and intended behavior and the strategy to be used in fixing. I can't follow every thread to see what the latest thinking is, especially on a patch like this where the original mission was changed.

Attached patch makes partial just return f if called with only one argument. Not sure if this will have a performance impact or not; a Clojure/core member would probably be able to better judge whether the use cases outweigh any performance hit.
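
A sketch of the idea (partial* is a hypothetical stand-in for clojure.core/partial; this is not the committed code):

```clojure
(defn partial*
  ([f] f)  ; new unary case: no args to fix, just return f
  ([f a] (fn [& args] (apply f a args)))
  ([f a & more] (fn [& args] (apply f a (concat more args)))))

((partial* inc) 5)    ;=> 6
((partial* + 1 2) 3)  ;=> 6
```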

Philip, thanks for writing that patch. Could you please attach it as a file to this ticket, instead of copying and pasting it into a comment? Instructions are under the heading "Adding patches" on this page:

Patch clj-1009-print-table-org-mode-patch2.txt dated June 8, 2012 is the same as Stuart Halloway's print-table-org-mode.txt of the same date, except a couple of tests have been added. His code changes look correct to me.

clj-1009-print-table-org-mode-patch3.txt dated Aug 15, 2012 is the same as the previous -patch2.txt version of Jun 8, 2012 (now deleted as obsolete), except it corrects the header separator as identified by Aaron Bedra.

Presumptuously changing approval from Incomplete back to Vetted, as it was before Aaron Bedra described a problem with the patch, and I updated the patch to address his comment. Feel free to apply the smackdown if necessary.

Sorry, I was too quick to react on the ML (someone said it was related to hasheq caching, and I had the patch almost ready: on a project I had noticed too much time spent computing hasheq on vectors).
So no, my patch doesn't improve anything with keywords, symbols, or strings as keys. However, when keys are collections, it fares better.

In 1.4, for a "regular" object, it must fail two instanceof tests before calling .hashCode().
If we make keywords and symbols implement IHashEq and reverse the test order (first IHashEq, then Number), it should improve the performance of the above benchmark – except for Strings.

In 1.3, #'hash went through Object.hashCode and thus was simple and fast. Plus, collection hashes were cached.
In 1.4 and master, #'hash now goes through two instanceof tests (Number and IHashEq, in this order) before trying Object.hashCode as a last resort. Plus, collection hashes are not cached.
As such, I'm not sure Util.hasheq's inherent slowness (compared to Util.hash) and hasheq caching should be separated into two issues.

The caching-hasheq-v2.diff patchset reintroduces hash caching for collections' hasheq, reorders the instanceof tests (to test for IHashEq before Number), and makes Keyword and Symbol implement IHashEq to branch fast in Util.hasheq.

The first patch implements a reducers equivalent of clojure.core/mapcat, and the second patch adds a new reducers.clj to the clojure tests that checks that map, mapcat, filter, and reduce are equivalent in clojure.core and clojure.core.reducers.

Here's a stab-in-the-dark attempt at rewriting MultiFn to use atoms to swap between immutable copies of its otherwise mutable state.

The four pieces of the MultiFn's state that are mutable and protected by locks are now contained in the MultiFn.State class, which is immutable and contains convenience functions for creating a new one with one field changed. An instance of this class is now held in an atom in the MultiFn called state. Changes to any of these four members are now done with an atomic swap of these State objects.

The getMethod/findAndCacheBestMethod complex was rewritten to better enable the atomic logic. findAndCacheBestMethod was replaced with findBestMethod, which merely looks up the method; the caching logic was moved to getMethod so that it can be retried easily as part of the work that method does.

As it was, findAndCacheBestMethod seemingly had the potential to cause a stack overflow in a perfect storm of heavy concurrent modification, since it calls itself recursively if it finds that the hierarchy has changed while it was doing its work. This logic is now done in the CAS looping of getMethod, so hopefully that is no longer even an unlikely possibility.

There is still seemingly a small race here, since a regular variable is checked against the value in a ref. Now, as before, the ref could be updated just after the check but before the MultiFn's state is updated. Of course, only the method lookup part of a MultiFn call was synchronized before; the state could already change after the lookup but before the method itself executed, letting a stale method finish after the method had been changed. Things are no different now in general with the atom-based approach, so perhaps this race is not a big deal, as a stale value can't persist for long.

The patch passes the tests, and Clojure and multimethods seem to work.

This patch gets rid of the ugly lock contention issues. I have not been able to benchmark it vs. the old multimethod implementation, but looking at it, I would not be surprised if it is faster when the system is in a steady state.

Obviously I didn't write the original code, so I'm not the ideal
person to explain this stuff. But I did work with it a bit recently,
so in the hopes that I can be helpful, I'm writing down my
understanding of the code as I worked with it. Since I came to the
code and sort of reverse engineered my way to this understanding,
hopefully laying this all out will make any mistakes or
misunderstandings I may have made easier to catch and correct. To
ensure some stability, I'll talk about the original MultiFn code as it
stands at this commit: https://github.com/clojure/clojure/blob/8fda34e4c77cac079b711da59d5fe49b74605553/src/jvm/clojure/lang/MultiFn.java

There are four pieces of state that change as the multimethod is either
populated with methods or called.

methodTable: A persistent map from a dispatch value (Object) to
a function (IFn). This is the obvious thing you think it is,
determining which dispatch values call which function.

preferTable: A persistent map from a dispatch value (Object) to
another value (Object), where the key "is preferred" to the value.

methodCache: A persistent map from a dispatch value (Object) to
function (IFn). By default, the methodCache is assigned the same
map value as the methodTable. If values are calculated out of the
hierarchy during a dispatch, the originating value and the
ultimate method it resolves to are inserted as additional items in
the methodCache so that subsequent calls can jump right to the
method without recursing down the hierarchy and preference table.

cachedHierarchy: An Object that refers to the hierarchy that is
reflected in the latest cached values. It is used to check if the
hierarchy has been updated since we last updated the cache. If it
has been updated, then the cache is flushed.

I think reset(), addMethod(), removeMethod(), preferMethod(),
prefers(), isA(), dominates(), and resetCache() are extremely
straightforward in both the original code and the patch. In the
original code, the first four of those are synchronized, and the other
four are only called from methods that are synchronized (or from
methods only called from methods that are synchronized).

Invoking a multimethod through its invoke() function will call
getFn(). getFn() will call the getMethod() function, which is
synchronized. This means any call of the multimethod will wait for and
take a lock as part of method invocation. The goal of the patch in
this issue is to remove this lock on calls into the multimethod. It in
fact removes the locks on all operations, and instead keeps its
internal mutable state by atomically swapping a private immutable
State object held in an Atom called state.

The biggest change in the patch is to the
getFn() -> getMethod() -> findAndCacheBestMethod() complex from the
original code. I'll describe how that code works first.

In the original code, getFn() does nothing but bounce through
getMethod(). getMethod() tries three times to find a method to call,
after checking that the cache is up to date and flushing it if it
isn't:

1. It checks if there's a method for the dispatch value in the
methodCache.

2. If not, it calls findAndCacheBestMethod() on the
dispatch value. findAndCacheBestMethod() does the following:

1. It iterates through every map entry in the method table,
keeping at each step the best entry it has found so far
(according to the hierarchy and preference tables).

2. If it did not find a suitable entry, it returns null.

3. Otherwise, it checks if the hierarchy has been changed since the cache
was last updated. If it has not changed, it inserts the method
into the cache and returns it. If it has been changed, it
resets the cache and calls itself recursively to repeat the process.

3. Failing all else, it will return the method for the default
dispatch value.

Again, remember everything in the list above happens in a call to a
synchronized function. Also note that as it is currently written,
findAndCacheBestMethod() uses recursion for iteration in a way that
grows the stack. This seems unlikely to cause a stack overflow unless
the multimethod is getting its hierarchy updated very rapidly for a
sustained period while someone else tries to call it. Nonetheless, the
hierarchy is held in an IRef that is updated independently of the
locking of the MultiFn. Finally, note that the multimethod is locked
only while the method is being found. Once it is found, the lock is
released and the method actually gets called afterwards without any
synchronization, meaning that by the time the method actually
executes, the multimethod may have already been modified in a way that
suggests a different method should have been called. Presumably this
is intended, understood, and not a big deal.

Moving on now to the patch in this issue. As mentioned, the main
change is updating this entire apparatus to work with a single atomic
swap to control concurrency. This means that all updates to the
multimethod's state have to happen at one instant in time. Where the
original code could make multiple changes to the state at different
times, knowing it was safely protected by an exclusive lock, rewriting
for atom swaps requires us to reorganize the code so that all updates
to the state happen at the same time with a single CAS.

To implement this change, I pulled the implicit looping logic from
findAndCacheBestMethod() up into getMethod() itself, and broke the
"findBestMethod" part into its own function, findBestMethod(), which
makes no update to any state while implementing the same
logic. getMethod() now has an explicit loop to avoid stack-consuming
recursion on retries. This infinite for loop covers all of the logic
in getMethod() and retries until a method is successfully found and a
CAS operation succeeds, or we determine that the method cannot be
found and we return the default dispatch value's implementation.

I'll now describe the operation of the logic in the for loop. The
first two steps in the loop correspond to work getMethod() did before
its looping construct in the original code, but which must now happen
inside the loop so that each retry sees the latest values.

1. First we dereference our state, and assign this value to both
oldState and newState. We also set a variable called needWrite to
false; this is so we can avoid doing a CAS (they're not free) when
we have not actually updated the state.

2. Check if the cache is stale, and flush it if so. If the cache
gets flushed, set needWrite to true, as the state has changed.

3. Check if the methodCache has an entry for this dispatch
value. If so, we are "finished" in the sense that we found the
value we wanted. However, we may need to update the state. So,

If needWrite is false, we can return without a CAS, so just
break out of the loop and return the method.

Otherwise, we need to update the state object with a CAS. If
the CAS is successful, break out of the loop and return the
target function. Otherwise, continue on the next iteration
of the loop, skipping any other attempts to fetch the method
later in the loop (useless work, at this point).

4. The value was not in the methodCache, so call the function
findBestMethod() to attempt to find a suitable method based on the
hierarchy and preferences. If it does find us a suitable method,
we now need to cache it ourselves. We create a new state object
with the new method cache and attempt to update the state atom
with a CAS (we definitely need a write here, so no need to check
needWrite at this point).

The one thing that is possibly questionable is the check at this
point to make sure the hierarchy has not been updated since the
beginning of this method. I inserted this here to match the
identical check at the corresponding point in
findAndCacheBestMethod() in the original code. That is also a
second check, since the cache is originally checked for freshness
at the very beginning of getMethod() in the original code. That
initial check happens at the beginning of the loop in the
patch. Given that there is no synchronization with the method
hierarchy, it is not clear to me that this second check is needed,
since we are already proceeding with a snapshot from the beginning
of the loop. Nonetheless, it can't hurt as far as I can tell, it
is how the original code worked, and I assume there was some
reason for that, so I kept the second check.

5. Finally, if findBestMethod() failed to find us a method for the
dispatch value, find the method for the default dispatch value and
return that by breaking out of the loop.

So the organization of getMethod() in the patch is complicated by
three factors: (1) making the retry looping explicit and stackless,
(2) skipping the CAS when we don't need to update state, and (3)
skipping needless work later in the retry loop if we find a value but
are unable to succeed in our attempt to CAS. Invoking a multimethod
that has a stable hierarchy and a populated cache incurs no CAS
operation (or memory allocation) on this code path, just a cache
lookup after the dispatch value is calculated.

I've updated this patch (removing the old version, which is entirely superseded by this one). The actual changes to MultiFn.java are identical (modulo anything that came through in the rebase), but this newer patch has tests of basic multimethod usage, including defmulti, defmethod, remove-method, prefer-method, and usage of these with a hierarchy that is updated in a way interleaved with calls.

I created a really, really simple benchmark to make sure this was an improvement. The following tests were on a quad-core hyper-threaded 2012 MBP.

With two threads contending for a simple multimethod:
The lock-based multifns run at an average of 606ms, with about 12% user, 15% system CPU at around 150%.
The lockless multifns run at an average of 159ms, with about 25% user, 3% system CPU at around 195%.

With four threads contending for a simple multimethod:
The lock-based multifns run at an average of 1.2s, with about 12% user, 15% system, CPU at around 150%.
The lockless multifns run at an average of 219ms, with about 50% user, 4% system, CPU at around 330%.

It's been a couple of months, and so I just wanted to check in and see if there was anything else needed to move this along.

Also, Alan Malloy pointed out to me that my benchmarks above did not mention single-threaded performance. I actually wrote this into the tests above, but I neglected to report them at the time. Here are the results on the same machine as above (multithreaded versions are basically the same as the previous comment).

With a single thread performing the same work:
The lock-based multifns run at an average of 142ms.
The lockless multifns run at an average of 115ms.

So the lockless multimethods are still faster even in a single-threaded case, although the speedup is more modest compared to the speedups in the multithreaded cases above. This is not surprising, but it is good to know.

My only concern is that the extra allocations of the State class will create more garbage, but this is probably not significant if we are already using persistent maps. It would be interesting to compare this implementation with one using thread-safe mutable data structures (e.g. ConcurrentHashMap) for the cache.

I think your assessment that it's not significant compared to the current implementation using persistent maps is correct. Regarding the extra garbage, note that the new State is only created when the hierarchy has changed or there's a cache miss (related, obviously)... situations where you're already screwed. Then it won't have to happen again for the same method (until another change to the multimethod). So for most code, it won't happen very often.

ConcurrentHashMap might be faster, it'd be interesting to see. My instinct is to keep it as close to being "made out of Clojure" as possible. In fact, it's hard to see why this couldn't be rewritten in Clojure itself some day, as I believe Chas Emerick has suggested. Also, I would point out that two of the three maps are used from the Clojure side in core.clj. I assume they would be happier if they were persistent maps.
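For comparison, a cache built on a thread-safe mutable map might look like the following sketch. This is hypothetical and not part of any patch here (and computeIfAbsent requires Java 8, which postdates this discussion):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical alternative: a mutable, thread-safe method cache.
// A hierarchy change is handled by discarding the whole map.
class ConcurrentMethodCache {
    private volatile ConcurrentHashMap<Object, Object> cache = new ConcurrentHashMap<>();
    private final Function<Object, Object> findBestMethod;

    ConcurrentMethodCache(Function<Object, Object> findBestMethod) {
        this.findBestMethod = findBestMethod;
    }

    Object getMethod(Object dispatchVal) {
        // computeIfAbsent is atomic per key, so no retry loop is needed
        return cache.computeIfAbsent(dispatchVal, findBestMethod);
    }

    void invalidate() {                  // e.g. when the hierarchy changes
        cache = new ConcurrentHashMap<>();
    }
}
```

The trade-off is exactly the one noted above: fewer allocations on a miss, but the cache is no longer "made out of Clojure" and can't be shared with the core.clj side as a persistent map.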

Funny story: I was going to point out the parts of the code that were called from the clojure side just now, and alarmingly cannot find two of the functions. I think I must have misplaced them while rewriting the state into an immutable object. Going to attach a new patch with the fix and some tests for it in a few minutes.

As promised, I reimplemented those two functions. I also added more multimethod tests to the test suite. The new tests should definitely prevent a similar mistake. While I was at it, I went through core.clj and found all the multimethod API functions I could and ensured that there were at least some basic functionality tests for all of them. The ones I found were: defmulti, defmethod, remove-all-methods, remove-method, prefer-method, methods, get-method, prefers (Some of those already had tests from the earlier version of the patch).

Really sorry about catching this right after you vetted the patch. 12771 test assertions were apparently not affected by prefers and methods ceasing to function, but now there are 12780 to help prevent a similar error. Since you just went through it, I'm leaving the older version of the patch up so you can easily see the difference to what I've added.

Patch clj-988-tests-only-patch-v1.txt dated Sep 15 2012 is a subset of David Santiago's
patch issue988-lockless-multifn+tests-120817.diff dated Aug 17 2012. It includes only the tests from that patch. Applies cleanly and passes tests with latest master after Rich's read/write lock change for multimethods was committed.

As described in the summary, calling (long ...) on a character throws a ClassCastException. I know the semantics surrounding numbers has changed a bit, but I think it would be nice for this to be consistent.

Same issue as CLJ-886, but while the patch applied for that issue fixes the problem for many OS/JDK combos, there appear to be differences in how surrogate pair characters are handled in some OS/JDK combos. In particular, at least Linux + IBM JDK 1.5 and Linux + IBM JDK 1.6 still fail the tests checked in for do-copy.

There are still failing tests for Linux + IBM JDK 1.6. This patch currently skips the failing tests whenever the java.vendor is "IBM Corporation" and java.version is "1.6.0", so that "ant" succeeds. It is easy enough to modify the patch so that the failing tests are kept as a bright shining reminder. Let me know if that would be preferred – it just involves removing the function ibm-jdk16, and simplifying where it is called after replacing it with false.

clj-967-disable-failing-io-copy-tests-on-ibm-jdk-16-patch1.txt dated May 19, 2012 disables tests of clojure.java.io/copy that fail with IBM JDK 1.6.0. It makes minor changes to the clojure.java.io code file, but these are only to eliminate uses of reflection, and to make a few of the versions of do-copy share more of their implementation code.

Tested that all results are good on Mac OS X 10.6.8 + Oracle/Apple JDK 1.6.0 and Ubuntu 11.10 + Oracle JDK 1.7.0, including that the number of tests run are identical to before this patch. Only 2 tests are disabled on IBM JDK 1.6.0, and all of the rest pass before and after these changes.

Hoping that I'm not out of line changing approval from Incomplete to None after attaching a patch that should address the reason it was marked incomplete. The only other way to get a ticket out of Incomplete state is for a core team member to do it for me, which seems like busy-work.

Any comments from Rich or other screeners on the status of this one? Just curious, since it seemed to have been screened, and then quietly tossed back into the unscreened pile. Is it considered undesirable to disable selected tests for a particular JDK as the patch clj-967-disable-failing-io-copy-tests-on-ibm-jdk-16-patch1.txt does? My motivation was to selectively disable only the tests that fail, and only on the JDK where they fail, leaving all passing tests to continue to run wherever possible.

In test-clojure/rt.clj, the test last-var-wins-for-core evaluates #'clojure.set/subset?. This fails unless clojure.set has been loaded. In the normal run of the test suite, this dependency is satisfied by test-clojure/core-set being loaded first.

clj-964-add-require-of-clojure-set-patch1.txt dated March 31, 2012 applies cleanly as of latest master. It simply adds a require of clojure.set to test_clojure/rt.clj. I verified that if that test was the only one, it does fail without the require, and passes with it. I also verified that every other test file succeeds on its own without any further changes.

One-line simple fix. clj-962-subvec-nth-throws-on-negative-index-patch1.txt dated March 29, 2012 applies, builds, and tests cleanly on latest master. Includes a few new tests that exhibit the problem. One author has signed CA.

clj-962-subvec-nth-throws-on-negative-index-patch2.txt dated May 10, 2012 is identical to previous patch clj-962-subvec-nth-throws-on-negative-index-patch1.txt dated Mar 29, 2012, except previous one failed to apply cleanly to latest master because of some lines of context changing in the source code.

In order to produce accurate source maps, I need column information in addition to line information from the Clojure reader.

I've made the necessary enhancement to LispReader, etc. but have some cleanup and testing left to do. I'd also like a sanity check from the core team before attaching a formal patch. You can find my work in progress here: https://github.com/brandonbloom/clojure/compare/columns

Yes, Brandon, your patch has been on the list of prescreened patches since April 15. It has always applied and built cleanly during that time. That patch label is nice to have for certain JIRA report filters, but isn't necessary for the prescreening process to pick it up.

v2 now applies against the current master (191b05f1). The original patch seemed to be broken from a whitespace perspective, which was making it difficult to apply--even in a 3-way merge. The only real conflict was in Compiler.java where a "final int line" was added to CompilerException. All the tests passed.

It looks like the line field was added to CompilerException in commit 89245c68, but that commit doesn't use it for anything. Maybe a later commit uses it? Also, if we want to keep that field, can we also add a column field for patch v3?

The tests are a little fragile (exact map comparison) and so will break if the reader ever does more things. In library and app projects I write tests like this using core.match, but that isn't available as a build dependency here.

bigdec handles java.math.BigInteger when converting to java.math.BigDecimal, but it does not handle clojure.lang.BigInt. Instead it treats a clojure.lang.BigInt as a Number, by casting it to long. This causes the following error:

Add a case to bigdec to handle BigInts. Also eliminate a reflection warning in the ratio case while we are in there. Paul's failing case has been added to tests, fails before the fix, and passes after. Attempted to make it as run-time efficient as possible by creating a new BigInt/toBigDecimal, patterned after the existing BigInt/toBigInteger.

I was originally thinking of something like (BigDecimal. (.toBigInteger ^clojure.lang.BigInt x)). Adding a toBigDecimal method to clojure.lang.BigInt saves some object allocations and such. Probably more of a micro optimization, but it works.
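The allocation-saving idea can be illustrated with a stand-in for clojure.lang.BigInt. The field names lpart/bipart follow the real class (a long for small values, a BigInteger for large ones); the rest is a sketch, not the actual patch:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Stand-in for clojure.lang.BigInt, just enough to show toBigDecimal.
class BigIntSketch {
    final long lpart;
    final BigInteger bipart; // null when the value fits in lpart

    BigIntSketch(long lpart, BigInteger bipart) {
        this.lpart = lpart;
        this.bipart = bipart;
    }

    // Small values skip the intermediate BigInteger entirely.
    BigDecimal toBigDecimal() {
        return bipart == null ? BigDecimal.valueOf(lpart) : new BigDecimal(bipart);
    }
}
```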

clj-948-annotate-gen-class-constructors-patch2.txt same as Kevin's CLJ-948.patch, but with slight update so it applies cleanly to latest master as of March 9, 2012. ant builds and tests with no errors or warnings. One author Kevin Downey has signed CA.

clj-948-annotate-gen-class-constructors-patch3.txt on Mar 26, 2012 is no different from previous patches, except that it applies cleanly to latest master as of that date. One author Kevin Downey has signed a CA. Kevin, you don't need to thank me again. The last time gave me eye damage (only joking).

When the first unicode code point of a string is supplementary (i.e. requires two 16-bit Java chars to represent in UTF-16), and that first code point is changed by converting it to upper case, clojure.string/capitalize gives the wrong answer.

If using UTF-16 to encode Unicode strings, and making every UTF-16 code unit (i.e. Java char) individually indexable as a separate entity in strings, is such a bad design choice that you consider it a bug, then yes, this is a Java bug (and a bug in all the other systems that use UTF-16 in this way).

clojure.string/capitalize isn't using some Java capitalization method that has a bug, though. By calling (.toUpperCase (subs s 0 1)) it is not giving enough information to .toUpperCase for any implementation, Java or otherwise, to do the job correctly. It is analogous to calling toupper on the least significant 4 bits of the ASCII encoding of a letter and expecting it to return the correct answer.
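A plain-Java illustration of the point. U+10428 (DESERET SMALL LETTER LONG I) needs two chars in UTF-16 and upper-cases to U+10400; splitting at a char boundary instead of a code-point boundary hands toUpperCase an unpaired surrogate, which it must leave unchanged:

```java
// Contrast of the two splitting strategies for capitalize.
class CapitalizeDemo {
    // Mirrors the substring-of-length-1 approach described above.
    static String brokenCapitalize(String s) {
        return s.substring(0, 1).toUpperCase() + s.substring(1).toLowerCase();
    }

    // Split at the end of the first *code point*, not the first char.
    static String capitalize(String s) {
        if (s.isEmpty()) return s;
        int end = s.offsetByCodePoints(0, 1);
        return s.substring(0, end).toUpperCase() + s.substring(end).toLowerCase();
    }
}
```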

When requiring a namespace that doesn't compile, a namespace is still created. The attached patch removes the namespace on failure if the namespace wasn't already present on entry to load-lib. See the test case in the patch for repro instructions.

This is obviously a subset of having atomic loads. Would a step further in this direction, e.g. using a swapable state object within clojure.lang.Namespace be of interest?

This patch accomplishes its purpose, although (in the absence of fully atomic load) I would imagine it creates a race condition if one thread tried to load a broken namespace while another thread tried simply to create a namespace.

Hugo, your patch doesn't apply cleanly to latest master due to some changed lines of context around it that are from before Nov 2011, which confuses me given that your patch was created recently. I could mechanically update it, but if you could take a look and create an updated patch yourself it would be safer.

Applies cleanly against d4170e65d001c8c2976f1bd7159484056b9a9d6d. This looks good to me. We should at some point talk more about the implications of checking IPersistentList, but I think there is enough value here to push it forward.

add a heading such as "<h4>Binding Forms (Argument Destructuring)</h4>" to the section describing it

make all occurrences of "binding-form" a link to that heading, especially under the let and fn headings

for (fn ..) add a new second paragraph like "Regarding binding forms, also known as argument destructuring, <a href=#the-new-heading>read more in the binding forms section</a> under the let special form."

Why:
It was always surprisingly difficult for me to google out the explanation of destructuring in Clojure and only today have I discovered that it is described pretty well under the let special form. Thus it should be made more visible to the readers (and preferably also search engines). (Even though Google has returned the page as one of the first matches, I had problems seeing it there - partly due to usually mistyping destructuring e.g. as deconstruction).

Thanks a lot!

PS: I haven't found a better way to report this, such as a commenting capability on the page itself.

Jakub, I don't think I have authorization to edit those web pages in the way you suggest. I did want to comment that the most recent version of the Clojure cheatsheet at http://clojure.org/cheatsheet now has a link called "examples" in the "Special Forms" section that links to the 'let' section of the special forms documentation page.

clj-934-transient-disj-patch2.txt fixes a problem with my previous one where it had a warning during testing because of a poorly named test that conflicted with a name in clojure.core. This one is clean.

The doc string for contains? covers the vector and Java array case explicitly. I'm not saying that this behavior shouldn't change, but at least it is well documented what it currently does in these cases.

Agreed. I just want to make sure that we are still ok with this functionality given that things are changing. Are there others (Stuart) that want to chime in here and make the intentions clear? If this is good then I would consider this screened and ready.

With data reader support, it's impossible to write a program to read an arbitrary stream of Clojure forms. For example, the following code will fail with the current 1.4.0-beta1 tagged literal support:

#point [0 2]

It might be enough to require that the read side define a reader for point, but what if we do not wish to require that both sides of the conversation know about the #point tag beforehand? Using the identity function as a default allows the reader to handle the literal form as is even when it doesn't recognize a tag (or even care to translate it).

The change from the current behavior is that missing tagged literal parsers are no longer error conditions.

I'd like to preserve the unknown literal tag as well as the value. That would allow a program to recreate the printed representation. A map would work as the value. You'd get something like: {:unknown-literal point :value [0 2]}. If you needed to pass that on to some other process, you could easily write it in the original literal form. Perhaps the key for literal tag should be namespace qualified to avoid possible confusion with user data. Another benefit of returning a map for an unknown literal tag is that equality tests still seem reasonable: (not= #foo "abc" #bar "abc").

It would be convenient if I could handle unknown tags using some sort of catch-all key in *data-readers* (say 'default). The associated function should take two arguments: the tag and the literal value. If there is no 'default key in *data-readers*, then an error would be thrown (same as Clojure 1.4).

I think it's a simple way to allow the programmer to take control without having to add new API or data types. It's just one distinguished key ('default, :default, or something like that) and one additional line of doc.

This needs to be addressed. We should follow the advice given in the edn docs:

If a reader encounters a tag for which no handler is registered, the implementation can either report an error, call a designated 'unknown element' handler, or create a well-known generic representation that contains both the tag and the tagged element, as it sees fit. Note that the non-error strategies allow for readers which are capable of reading any and all edn, in spite of being unaware of the details of any extensions present.

We can get both of the latter by having a preinstalled default unknown element handler that returns the generic representation. "identity" is out since it loses the tag. Using a map with namespaced keys is somewhat subtle. A TaggedElement record type is also possible.

Issues are - what happens when more than one library tries to supply a default? If the system supplies one, perhaps it's best to only allow dynamic rebinding and not static installation. Or error on conflicting default handlers, or first/last one wins (but first/'last' might be difficult to predict).

Minimal patch that adds support for a default data reader in *data-readers*. If an unknown tag is read, we look up the 'default reader in *data-readers* and call it with two arguments: the tag and the value. By default, there is no default reader and you get an exception as in Clojure 1.4.

An alternative patch that establishes a default data reader to handle the case of an unknown tag. The default reader returns a map with metadata to define the :unknown-tag. This preserves the information for the unknown data literal, but keeps a simple and convenient format.

I am guessing that you consider these two patches alternatives, not cumulative. I am marking as screened the "default-in-data-readers" patch, which allows users to specify a 'default handler.

The other patch "unknown-data-reader", which specifies a built-in Clojure handler for unknown tags, is not screened. It specifies a default handler that returns a metadata-annotated map. If there is to be a default handler, I think it would need to return something disjoint from data, e.g. a tagged interface of some kind (or at least a namespaced name in the map.)

It would be a great help to have a little discussion along with the patches.

Yes, the two patches are separate alternatives. The "default-in-data-readers" patch just adds the special treatment for the 'default key in *data-readers* without providing a system default. This allows the application programmer to provide a catch-all handler for unknown data literal tags. Libraries should be discouraged from setting a 'default handler. Conflicts with the 'default key in data_readers.clj are handled just like other keys so it would be a bad thing for multiple libraries to try to take over the 'default data reader. Libraries can instead provide implementations and let the application programmer do the actual setting of the 'default key.

The second patch ("unknown-data-reader") implements 'default similarly, but also provides for a 'default reader in default-data-readers. My unknown-data-reader returns a map. I found it safer and simpler to deal with keywords instead of symbols for the unknown tag – Clojure doesn't like symbols for unknown packages and you never know what you might get in data literal tags. After experimenting with my original idea of having a map with two keys (essentially :tag and :value), I decided that I preferred the single-entry map with the key derived from the tag and the value preserved as read. Adding the metadata to define the :unknown-tag provides enough information for the application programmer to deal with the maps unambiguously. I think the single-entry maps are easier to read.

The alternative that I did not pursue would be to use a new record type as the default data type. My second patch could be used as a basis for that approach. We just need to replace my unknown-data-reader function with one that creates a record (TaggedElement).

I don't like the 'default entry, especially if it is usually wrong for a library to set it.

Having no bindable var default makes editing data_readers.clj an outstanding chore for everyone.

It is unlikely a single special override in data_readers.clj is suitable for every reading situation, even in the same app.

If there is a known TaggedElement type, then people need only be able to opt out of that.
So if there were a default-tagged-reader function of (tag read-element) that built a TaggedElement (or, alternatively, threw, if people prefer), people could just rebind that in a particular reading context.

There's no perfect default, but in most situations getting unknown data is an error. I personally would default to that, and provide a read-tagged-element fn people could bind in when trying to implement read-anything.

Old patches deleted. This revised patch introduces a var *default-data-reader-fn* which can be used to handle unknown tags. By default it is nil and an exception is thrown for an unknown tag, as in Clojure 1.4. If it is non-nil, the function is called with the tag and value. I chose the name so that it contains 'data-reader', which makes it search-friendly. I wanted to commit this separately from any attempt to provide a built-in catch-all reader and new record type, as that might be more contentious.
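In Java-flavored terms, the dispatch this adds looks roughly like the following sketch. Names and types are illustrative, not LispReader's actual code:

```java
import java.util.Map;
import java.util.function.BiFunction;

// Illustrative model of tagged-literal dispatch: try a per-tag reader,
// fall back to an optional catch-all applied to (tag, value), else error.
class TaggedReaderDispatch {
    final Map<String, BiFunction<String, Object, Object>> readers;
    final BiFunction<String, Object, Object> defaultReader; // may be null

    TaggedReaderDispatch(Map<String, BiFunction<String, Object, Object>> readers,
                         BiFunction<String, Object, Object> defaultReader) {
        this.readers = readers;
        this.defaultReader = defaultReader;
    }

    Object readTagged(String tag, Object value) {
        BiFunction<String, Object, Object> r = readers.get(tag);
        if (r != null) return r.apply(tag, value);
        if (defaultReader != null) return defaultReader.apply(tag, value);
        throw new RuntimeException("No reader function for tag " + tag);
    }
}
```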

The patch for *default-data-reader-fn* was committed for Clojure 1.5 beta1. The programmer can now specify a catch-all reader, which solves the main issue. However, there is no built-in default reader so unknown tags still cause exceptions to be thrown as in Clojure 1.4.

I think this bug can be closed. Default reader implementations could be provided by a contrib library. Or someone can open a new bug if you want the language to provide a built-in default reader.

In general Clojure's number types can be read prefixed with either a +
or - and this seems to work correctly for reading integers and floats.
In the case of ratios however things break down when ratios are
prefixed with a +.

The ratio pattern in LispReader.java does match on ratios starting
with both + and - but matchNumber fails on ratios prefixed with +
because it ends up calling "new BigInteger(m.group(1))" and it turns
out the constructor for BigInteger has no problems with negative
numbers but it doesn't like numbers prefixed by a +.

clj-923-reading-ratios-prefixed-by-plus-patch2.txt still semantically same as Cosmin's original patch, except it applies, builds, and tests cleanly on latest master as of Mar 23, 2012. Context lines around patch must have changed recently.

I'm seeing some team members (myself included) rely on metadata. Metadata has been incredibly useful in some cases; however, the silent destruction of metadata by core Clojure fns (into, walk, etc.) has been a source of increasing complexity and confusion.

A team member could start relying on a 3rd party library that used 'into'. Actually, the into fn could essentially be added to the code base at any time in many ways. Wrt metadata the potential for incidental complexity increases exponentially as more developers and libraries enter the mix.

One of the reasons Clojure has worked for us so well is because it has greatly reduced incidental complexity.
IMHO the 'into metadata bug' seriously limits the usefulness of metadata and would love to see the into fn fixed in 1.4.

Someone please correct this if it is wrong, but it seems that into loses metadata because it is implemented using transients, and transients lose metadata. If true, then one way to fix this ticket is to implement metadata for transients.

Are there any fundamental reasons why it would be difficult to add metadata to transients, as long as the metadata itself was immutable?

First rough cut of a patch that makes not only into, but also select-keys, clojure.set/project, and clojure.set/rename preserve metadata. Added tests for these and many other functions. There are still some open questions, noted in comments, about other functions not tested. This patch does not attempt to make transients preserve metadata.

You could examine the existing patch, including its tests, and see if it would have done what you were hoping it would do. Add a comment here regarding whether the changes look good to you. The attached patch is already on my weekly list of prescreened patches, but it is only one among many.

clj-916-make-into-preserve-metadata-patch1.txt dated Aug 15 2012 only changes the behavior of into so that it preserves metadata. One or more other tickets will be created for some other functions that currently do not preserve metadata, but perhaps should.

The attached patch copies metadata given to the method name symbol of memfn to the method receiver in the expansion. That way, memfn is able to be used even for type-hinted calls resulting in a big performance win.

In some situations, it's necessary to configure the buffer size of LineNumberingPushbackReader's wrapped java.io.LineNumberReader, that gets created in the constructor. A concrete problem case is where you want to avoid doing reads from the underlying Reader whenever possible, so using a buffer size of 1 makes it a bit lazier. I can also imagine cases where you'd want to increase the buffer from java.io.BufferedReader's 8192-char default, but I haven't dealt with that one directly.

There's no good way to do this by subclassing LineNumberingPushbackReader, since all the action happens in the constructors: those of java.io.LineNumberReader and the superclass of LineNumberingPushbackReader, which is java.io.PushbackReader. So my current workaround is to copy the entirety of LineNumberingPushbackReader, change the name, and add a constructor. Having LineNumberingPushbackReader support this directly would be great.

Both java.io.LineNumberReader and java.io.PushbackReader have constructors that accept the buffer size as the second argument, so it seems very reasonable to me to add a similar constructor for LineNumberingPushbackReader.
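The proposed constructor would essentially be a one-liner. A standalone sketch of the idea (the real LineNumberingPushbackReader has more members; this shows only the buffer-size plumbing):

```java
import java.io.LineNumberReader;
import java.io.PushbackReader;
import java.io.Reader;

// Both wrapped classes take a buffer size as their second constructor
// argument, so supporting a configurable size is just a pass-through.
class SizedLineNumberingPushbackReader extends PushbackReader {
    public SizedLineNumberingPushbackReader(Reader r, int bufferSize) {
        super(new LineNumberReader(r, bufferSize), bufferSize);
    }

    // FilterReader's protected "in" field holds the LineNumberReader.
    public int getLineNumber() {
        return ((LineNumberReader) in).getLineNumber();
    }
}
```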

Turns out my current workaround to get around the lack of this ability has a problem: LispReader depends on the concrete LineNumberingPushbackReader class to be able to call .getLineNumber (via instanceof / casting). So the similar (read: nearly-copied) class I'm trying to use can't store line numbers, which makes the stack trace less nice (fwiw, this is in REPL-y: https://github.com/trptcolin/reply).

It would be nice to have an interface (ILineNumberingPushbackReader?) that declares getLineNumber, readLine, and atLineStart, and have things check instanceof on that instead of the concrete class. That way, anyone else adhering to the LineNumberingPushbackReader contract (in order to bind that to *in* as the docstring for `clojure.main/repl` prescribes) can do it and have line numbering play nicely with the Clojure reader.

If that sounds desirable, I can replace this patch. The existing patch will also work fine if we want to keep things concrete, or if that feels enough like solving a different problem to require another ticket.

Regarding the larger question of an interface I'd say another ticket is in order. The existence of LNPBR is an implementation detail, but I too have found it useful. Maybe there is a wider discussion on the usefulness of LNPBR and a sane way to expose it for tool, compiler, etc. devs?

It appears it is not safe to call resolve on a symbol representing a namespace; you get the error above. FWIW, I seem to have resolved the problem (see attached diff) by moving the find-ns clause above the resolve clause (in the cond); also the reference to namespace-doc needs to be var-quoted since namespace-doc is private.

While relaxing when changes to the environment are observed may affect the resulting value of the vector, it ought not be possible, just as with lazy seqs, to get two different results for the same access.

I think the docstring in the 24/Feb/12 patch is too long. Also not entirely correct: calling vec on a mutable java.util.List, for example, will create an immutable vector from the contents of the List.

Instead just say that Java arrays may be aliased by vec and should not be modified.

clj-892-clarify-sort-sort-by-doc-strings-patch1.txt of Mar 26, 2012 applies cleanly to latest master on that date. Only doc string changes, to make it clear that sort and sort-by will modify Java arrays given as arguments.
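The underlying Java behavior both doc strings warn about can be seen directly: java.util.Arrays.sort works in place, and Arrays.asList produces a view that aliases the array it wraps.

```java
import java.util.Arrays;
import java.util.List;

public class ArrayAliasingDemo {
    public static void main(String[] args) {
        // Arrays.sort mutates its argument in place.
        int[] nums = {3, 1, 2};
        Arrays.sort(nums);
        System.out.println(Arrays.toString(nums)); // [1, 2, 3]

        // Arrays.asList returns a view backed by the array: writes to the
        // array show through the list, i.e. the list aliases the array.
        String[] words = {"b", "a"};
        List<String> view = Arrays.asList(words);
        words[0] = "z";
        System.out.println(view.get(0)); // z
    }
}
```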

I've modified Vyacheslav's patch so that it is in the proper format. I also changed his implementation of function inc-s so that it should allocate a bit less memory, and removed an addition of a redundant test case that is in his patch. There is a bug he has found here, and I've verified that this patch fixes it.

clj-881-cl-format-exception-patch2.txt Mar 26, 2012 applies cleanly to latest master, and fixes the problem in the same way as my Feb 12, 2012 patch (since deleted to avoid confusion). I am the author, and have signed a CA.

clojure.string/replace uses Java's replace function to do its work. That function has the tricky habit of 'double evaluating' the replacement string (third argument): every \ in the replacement string (i.e. "\\") is interpreted by the Java code behind replace.

Since this behavior isn't documented, it can lead to confusing errors. For example (a made-up, semi-real-world example to illustrate the issue):

(clojure.string/replace "c:/windows/" #"/" "\\")

This should replace all Unix folder separators with Windows separators, but instead it crashes with an index-out-of-bounds exception, since Java tries to interpret "\\" as regex replacement syntax.

No hand-escaping is necessary, just use #(java.util.regex.Matcher/quoteReplacement %).
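For instance, the made-up path example above can be written safely in plain Java like this (the class and method names are illustrative):

```java
import java.util.regex.Matcher;

public class QuoteReplacementDemo {
    // quoteReplacement escapes '\' and '$' so replaceAll treats the
    // replacement literally instead of as regex replacement syntax.
    public static String toWindows(String path) {
        return path.replaceAll("/", Matcher.quoteReplacement("\\"));
    }

    public static void main(String[] args) {
        System.out.println(toWindows("c:/windows/")); // c:\windows\
    }
}
```

Without quoteReplacement, replaceAll("/", "\\") throws, because a lone backslash is incomplete regex replacement syntax.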

Edit: I see, we're talking about when you use a function as a replacement, not a replacement string like "x" - the function's output should not be interpreted as regex replacement, I agree. Looks like it just needs a small patch in clojure.string/replace-by.

CLJ-870-also-fix-replace-first.patch combines Alan Malloy's two patches, and adds the same change for replace-first. I like this change, too, and think replace and replace-first should be consistent with each other in this behavior.

Attaching slightly improved patch CLJ-753-870-905-combined-fix3.patch, along with a README to explain in gory detail the behavior and performance changes to clojure.string/replace and clojure.string/replace-first, in case it would help anyone review the changes. Deleting my older patches.

Added patch addressing what I think is Aaron's reason for changing state to Incomplete, so as a result I am changing state back to its former Vetted state. If there is something else a non-screener should be doing to bring Incomplete tickets to the attention of screeners, please let me know, and I will document it here: http://dev.clojure.org/display/design/JIRA+workflow

Records of different types that contain the same field values are not considered equal, but the record type is not currently included in the hash. This means that if a heterogeneous map contains records of different types whose fields aren't necessarily distinct, there is a high likelihood of hash collisions.

I propose that the record type be included in the hash code algorithm for records. There should be no expectation that unequal records of different types have the same hash code.
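A sketch of the idea in Java terms, assuming nothing about Clojure's actual hashing internals: mix something type-dependent (here, the class name's hash) into the field-value hash so that records of different types with equal fields hash differently.

```java
import java.util.Objects;

public class RecordHashDemo {
    // Illustrative only: combine the field-value hash with a type-derived
    // value, so equal fields under different types yield different hashes.
    public static int hashWithType(Class<?> type, Object... fieldValues) {
        return Objects.hash(fieldValues) * 31 + type.getName().hashCode();
    }

    public static void main(String[] args) {
        int a = hashWithType(String.class, "x", 1);
        int b = hashWithType(Integer.class, "x", 1);
        System.out.println(a != b); // true
    }
}
```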

It works, but it's inconvenient and adds complexity to the handling of bean maps.

Normal maps of course already return nil as the default value for a missing key. The current bean proxy does the contains? check on the key only if there's a default value. The patch uses the same contains? check for the single-arg version of valAt.
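A sketch of the lookup logic described above, using a plain java.util.Map as a stand-in for the bean proxy (valAt and the class name are illustrative, not Clojure's actual implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class ValAtDemo {
    // Single-arg lookup delegates to the two-arg version with nil default,
    // so both arities do the same containsKey check: a key mapped to null
    // and a missing key behave consistently.
    public static Object valAt(Map<String, Object> m, String key) {
        return valAt(m, key, null);
    }

    public static Object valAt(Map<String, Object> m, String key, Object notFound) {
        return m.containsKey(key) ? m.get(key) : notFound;
    }

    public static void main(String[] args) {
        Map<String, Object> m = new HashMap<>();
        m.put("present", null);
        System.out.println(valAt(m, "present"));           // null
        System.out.println(valAt(m, "absent", "default")); // default
    }
}
```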

clojure.core/bases returns a clojure.lang.Cons when called with a class as parameter, but a Java array ( [Ljava.lang.Class; ) when called with an interface. Both should return values of the same type, which I guess should be a seq.
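The asymmetry presumably traces straight to the reflection API: for an interface, Class.getSuperclass() returns null, leaving only the raw getInterfaces() array to return.

```java
import java.util.Arrays;

public class BasesDemo {
    public static void main(String[] args) {
        // A class has a superclass plus interfaces; an interface has no
        // superclass, only its extended interfaces as an array.
        System.out.println(String.class.getSuperclass());     // class java.lang.Object
        System.out.println(Comparable.class.getSuperclass()); // null
        System.out.println(Arrays.toString(Runnable.class.getInterfaces())); // []
    }
}
```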

I have attached a patch which calls the seq function on the else branch of clojure.core/bases. Also updated the clojure.test-clojure.java-interop/test-bases test. However, these tests do not currently seem to run as part of the Maven build. I have run them manually and verified that they pass.

Behavior where bases returns Java array in some cases still exists on latest master, and this patch still applies cleanly and changes that behavior as described.

The patch writer Alf is not on the list of people who have signed a CA. If he isn't in the process of doing so, what should be done with the patch? If I described the behavior to a committer without showing them the patch, they would likely come up with a similar one-line fix.

I was not aware that contributors had to sign the CA to get a patch in when I wrote this, so my apologies for adding the patch. I just wanted to make the contribution an "easy add". I guess you guys would have figured out a similar fix really easy.

I am not in the process of signing the CA. I would not have any problems signing it, but then I guess for such a small contribution it would be too much of a hassle. And besides, I don't think this small contribution makes me a worthy contributor.

I would certainly not mind you guys fixing this in anyway you can, with or without my patch.

Independently written patch that fixes bases so that it always returns a seq. Added additional tests to verify that it returns nil (rather than an empty seq) when there are no bases to maintain compatibility with the original code.

"At times it is necessary to have a value of a particular primitive type. These coercion functions yield a value of the indicated type as long as such a coercion is possible: bigdec bigint boolean byte char double float int long num short"

(int), (char), etc. are not intended to return boxed types; they always return primitive types. If you want boxed types you can use (num). When you do (class (int 42)) you are casting the long 42 to an int and then boxing it into a Long to get its class.

For short and byte, I don't know, I think there is a bug, (class (short 42)) should be Short.

Thank you, Sebastián, that made it clear and unveiled the problem at my side. When I did

(set-value! foo :position (int 17))

I got the exception "java.lang.Long cannot be cast to java.lang.Integer". set-value! is a protocol function extended on the Java types of the Java library I'm interacting with. Eventually, it calls (doto foo (.setAttribute "position" (mutable <providedVal>))), where mutable is yet another protocol function that converts Clojure collections to the library's counterparts and does nothing for primitive types.

Now, although (int 17) returns an unboxed int, these two indirections box it into a Long again, and the Java setter is called with that, leading to the error.

(set-value! foo :position (Integer/intValue 17))

circumvents the issue, because here you directly get an int boxed as Integer.
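In plain Java terms, the distinction is which wrapper a value auto-boxes into: a long boxes to java.lang.Long, while a primitive int (such as what Integer/intValue returns) boxes to java.lang.Integer.

```java
public class BoxingDemo {
    public static void main(String[] args) {
        long l = 17;
        int i = (int) l;      // primitive int
        Object boxedLong = l; // auto-boxes to java.lang.Long
        Object boxedInt = i;  // auto-boxes to java.lang.Integer
        System.out.println(boxedLong.getClass().getSimpleName()); // Long
        System.out.println(boxedInt.getClass().getSimpleName());  // Integer
    }
}
```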

If you want boxed types you can use (num).

Well, I want a boxed value of a certain type, like Integer.

For short and byte, I don't know, I think there is a bug, (class (short 42)) should be Short.

Should or should not? Currently, for all primitive types <primType>, the call (class (<primType> <val>)) returns the wrapper class corresponding to <primType> with the exception of int where you get a Long instead of an Integer. That's totally surprising and caused me to file this bug report.

clj-788-add-line-member-and-getter-to-CompilerException-patch.txt is identical to Hugo's, but for some reason that isn't obvious to me his did not apply cleanly to latest master when I tried it out, but this one does.

Brandon, just a warning: while anyone who looks at this ticket will see your comment, not many people examine closed tickets looking for things to do with them. If your comment means you think some additional change should be made to Clojure that wasn't made by the patch on this ticket (which I think was committed), nor by the patch committed for CLJ-960, then bringing it up in the clojure-dev group, or opening another ticket, might be more effective.

Attached patch assumes that CLJ-881 patch has been applied. It might apply cleanly without CLJ-881 patch being applied first – I haven't checked. Fixes the issue originally reported, and several other problems found with cl-format's ~f directive.

Thanks for the review, Tom. If you have the time, CLJ-881 is shorter, and fixes a different bug. I do know that if you apply the CLJ-881 recommended patch, and then this one, everything works fine. If it is helpful, I can try out this one independently, but they are both bugs in the ~f format of cl-format, the other one causing Clojure to throw an exception for some combinations of ~f directives and numerical values.

clj-768-without-clj-881-patched-first.txt is the same as the one that Tom reviewed, except it applies cleanly to master as of Feb 23, 2012 without having to first apply the patch for CLJ-881. This bug and the patch are really independent of CLJ-881.

clj-768-patch-for-after-clj-881-fixed-patch2.txt dated May 18, 2012 is the most up to date one as of the latest updates to Clojure master. Only context line changes from previous patches so that it applies cleanly, not substantive changes.

clj-763-destructuring-allow-dup-keys-patch2.txt dated Mar 26, 2012 applies cleanly to latest master and passes all tests, including a new one added based upon Meikel's report that fails without his fix. Supersedes his earlier patch 0001-Avoid-duplicate-key-check-in-destructuring.patch of Mar 21, 2011. Meikel and I have both signed CAs.

Uploaded 0001-Do-not-check-for-duplicates-in-destructuring-map-cre.patch with a faster implementation accessing the unchecked constructor directly plus a simplified test case. Supersedes the two previous patches. (I can delete only mine, but not Andy's.)

clj-757-fix-behavior-of-empty-transient-maps-patch2.txt is an update of Alan's original patch, with no changes other than those required to apply cleanly to latest master as of Feb 22, 2012. Compiles fine and runs without test errors, and does fix the behavior of the one new unit test included.

Approval was changed from Test to Incomplete when it was waiting on Rich's feedback. He did give his feedback, but Approval never changed back to Test. Doing that now in hopes it makes the ticket more accurately categorized by filters. Feel free to give me a stern warning for mucking with the Approval field if I'm out of line here.

clj-757-fix-behavior-of-empty-transient-maps-patch3.txt dated June 18, 2012 is identical to clj-757-fix-behavior-of-empty-transient-maps-patch2.txt, except it has some updated context lines so that it applies cleanly with latest Clojure master. I will remove the -patch2.txt patch since it no longer applies cleanly.

The patch in clj-757-fix-behavior-of-empty-transient-maps-patch3.txt is straightforward, and so is its test. It fixes only the behavior listed in the ticket and does not address general transient lookup behavior.

Currently, there is no way in Clojure to call a protected final method of a superclass. While this may be an acceptable limitation for proxy, gen-class should provide that ability. Otherwise, one is now forced to create a dummy subclass in Java for the sole purpose of widening the visibility of such methods.

My patch makes it so that protected final methods may be exposed via the :exposes-methods mechanism. It is still impossible to override them.
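A sketch of the Java dummy-subclass workaround described above, with illustrative names: the subclass cannot override the protected final method, but it can widen visibility by delegating to it, which is the same shape the :exposes-methods mechanism provides.

```java
public class ExposeDemo {
    static class Base {
        // Protected final: callable by subclasses, not overridable.
        protected final String secret() { return "ok"; }
    }

    // Dummy subclass whose sole purpose is widening visibility.
    static class Exposer extends Base {
        public String exposedSecret() { return secret(); }
    }

    public static void main(String[] args) {
        System.out.println(new Exposer().exposedSecret()); // ok
    }
}
```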

The tests in other_functions.clj for the combinator functions are tenuous in some cases (comp) and missing in others (juxt, partial, complement, and constantly). It would be nice to have a full suite of combinator tests moving forward.

I cannot vouch for the changes in Compiler.java – I simply modified what was in Juha Arpiainen's patch so that they applied cleanly. Some of the context changed noticeably due to a patch for CLJ-31 by Kevin Downey that was applied.

The test parts of the patch I can vouch for. Some of the previously existing tests seem not to test what they should. I can make a separate patch with just the updated passing tests if that is desired.

Somehow I just forgot to define parameters for my zero-argument function, and it took me a long time to notice, mostly because I had been working with a lot of macros, but especially because I didn't get an error that said "You must provide a list of parameter names", but instead: "Don't know how to create ISeq from: Symbol".

(defn add (a b) (+ a b))
IllegalArgumentException Parameter declaration a should be a vector clojure.core/assert-valid-fdecl

Here the error message is based on a guess that I was trying to use the multi-arity syntax, but I was probably using the single-arity syntax with the wrong brackets. You might split these cases out by counting the top-level forms first.

I think I am cool with the change to (fn) and (fn a), but I can imagine a code generator relying on this behavior as a sentinel to avoid explicit coding for a corner case. What do others think?

Stuart's example is an interesting one: it could have meant at least one other thing besides multi-arity syntax and single-arity syntax with params accidentally written as a list, namely single-arity syntax with a missing param list and two successive body forms (where a and b are free variables). I think the error message here would still be an improvement over "Don't know how to create ISeq from: Symbol", but maybe it tries to be too specific in talking about "a". I'd tend toward being more generic and just saying "Parameter declaration should be a vector", which I think would be helpful and correct in all the intended cases we've imagined for this particular example.

I think breaking (fn) / (fn a) makes sense - if there's a valid use for those I'd be interested in seeing it. Relying on ArityException in code generation seems fairly error-prone to me, but I could easily be missing something here.

To the boolean point, I tend to agree with Stuart and would prefer reversing the if/else clauses to avoid it.

clj-157-better-err-msgs-for-defn-fn-syntax-errors-patch2.txt has no substantive changes from Ambrose's two patches. It does combine them into one commit, and unlike the older ones applies cleanly to latest master as of March 7, 2012.

Reported by rosejn, Mar 19, 2009
Example:
user=> (if-let [foo (+ 1 2)] foo + 1 foo)
java.lang.IllegalArgumentException: if-let requires a vector for its
binding (NO_SOURCE_FILE:2)
The error reports that if-let requires a vector for its binding, but the
problem here has nothing to do with the binding being a vector. A paren
was forgotten before the '+', so the number of arguments is incorrect for
the if form.

I have a patch that gives a better error message. It will now say: "IllegalArgumentException if-let requires 1 or 2 forms to be passed". I'm not sure if that's a better error message or not, but I'm happy to tweak it.

clj-103-incorrect-if-let-error-patch2.txt is same as John's except updated to apply cleanly to latest master as of Feb 23, 2012. No compiler errors or warnings, no test failures. No new tests for this one – I could write one, but the only way to do so is to check the contents of the error message in the exception thrown, which seems a bit too specific. The author John Szakmeister is on the contributor list.