Proposal: allow disabling static initialisers for imported classes

In the main Clojure mailing list there's a long discussion (https://groups.google.com/d/topic/clojure/tWSEsOk_pM4/discussion) about a very frustrating problem. The issue is that when classes are imported in Clojure, the class is loaded using Class.forName(), which causes its static initialisers to be executed. This differs from Java where compilation does not cause classes to be loaded.

In many cases when those classes are normally loaded by Java code during execution of a framework of some kind (IntelliJ in my case, and RoboVM is another culprit mentioned in that thread) the initialiser expects some infrastructure to be in place and will fail when it's not. This means that it's impossible to AOT compile namespaces importing these classes, which is a fairly serious limitation.

Aaron Cohen suggested a fix which I have tested and works for my case, at least. The fix is to modify ImportExpr to call RT.classForNameNonLoading() instead of Class.forName(), which will load the class but not initialise it. I actually had no idea this was even possible, so I'm not sure what the wider implications of it are. This change causes the Clojure test suite to fail, since clojure.test-clojure.genclass imports a gen-class'ed class which no longer loads its namespace on initialisation. I'm not sure if this is considered an incorrect use of such a class (IIRC with records it's required to import the class and require its namespace), but given that it's in the Clojure test case it's reasonable to assume that this fix would be a fairly breaking change for more code out there.

However this change is very useful in compilation scenarios. Unless we're willing to mandate import and require for gen-class'ed classes and investigate what other side effects might occur, I suggest a configuration flag enabling this change when Clojure is used only for compilation and not for execution (i.e. in AOT scenarios).

If everyone likes the idea I'm happy to make the fix and file a JIRA with the patch.

Referencing an imported class during compilation shouldn't cause it to be fully initialised. If anyone is relying on this behaviour for the side effects of initialisation, I'd argue that is a bug in their code for depending on undocumented implementation details. (note: the docstring for import certainly doesn't promise any such behaviour)

There may also be a start-up time performance advantage in fixing this: We aren't being as lazy as we could be if all initialisation code is getting run immediately. This is particularly painful for stuff like GUI / database libraries which sometimes do a lot of work during static class initialisation (loading/querying system configuration, allocating working storage space, pre-computing lookup tables, building index data structures etc.). It is much better to defer this work until it is actually needed (or explicitly requested with Class/forName).

I think I'm leaning towards the breaking test depending on a few implementation details that we don't necessarily want to guarantee.

1) It expects that loading the Class results in loading the namespace (which isn't promised by gen-class, it's just the default)

2) It expects that importing the Class results in loading the class

3) It expects that loading the Class results in adding some default vars to the namespace (-toString for instance)

I've attached a patch that makes the proposed fix, and corrects the tests for the change.

Should we make a JIRA issue for this?

If this is added, I'd suggest the following for release notes:

"import" no longer causes the imported class to be initialized. This change better matches Java's import behavior and allows the importing of classes that do significant work at initialization time which may fail.

This semantics change is not expected to effect most code, but certain code may have depended on behavior that is no longer true.

1) importing a Class defined via gen-class no longer causes its defining namespace to be loaded, loading is now deferred until first reference. If immediate loading of the namespace is needed, "require" it directly.

2) Some code may have depended on import to initialize the class before it was used. It may now be necessary to manually call (Class/forName "org.example.Class") when initialization is needed. In most cases, this should not be necessary because the Class will be initialized automatically before first use.

For Counterclockwise, I am also confronted to problems related to early initialization of Eclipse classes when namespaces are loaded.

I can reproduce a case where this happens, and try the mentioned fix to see whether it is doing more good than harm.

What could go wrong is that, for AOT classes, if the loading of the namespace doesn't happen at the same time as the :import, then when it is lazily loaded, wrong classloader may be used, resulting in Clojure classes not being visible, etc.

I've run into this with a couple of Java classes I've imported
causing, for example, logging to be initialized during compilation(!)
which can make it impossible to actually set the correct logging
environment. If you do a lot of Java interop and you use libraries
that assume static initialization is a "safe" approach, you can run
into real trouble in Clojure at times :(

I didn't realize it could actually be fixed so I never bothered
bringing it up, I've just worked around it as best I can. I'd _love_
to see this fixed!

Status Update: the patch learned me that it's not sufficient for Counterclockwise, due to the fact that:

- I create constant maps in vars, referencing constants in Eclipse code, which cause the classes to be initialized anyway (if it's not at line 1 in the ns clause, it's e.g. at line 66 in the (def my-constants {:clojure-friendly-key ISomeEclipseClass/SomeStaticField ...})

- OSGi classloaders honor the "lazy loading" of some Eclipse plugins/bundles that say that before being able to get the Class for one of their class, they must have been "started" (meaning their "Plugin" class has been loaded, initialized, and even their start() method called).

So at least as far as I'm concerned, this patch could probably mitigate some pain, but not in a generic way :-(

Yeah, I had some constants similar to your first case which I had to fix as well (my Spacing problem from the original thread). I fixed that by just having the constants in a Java class, which is a little ugly but works. There's really no good workaround that I'm aware of for this in Clojure either unfortunately.

I've just encountered another case of this, which was not due to an import. I'm not sure exactly of the circumstances, but in this case what fixed it was removing a tag from a function defined at the top level with defn:(defn ^HighlightInfo make-highlight [key data]

... etc etc

In this case, the compilation fails with an ExceptionInInitializerError. If I remove the ^HighlightInfo and instead tag the function invocations, everything works fine.

I haven't investigated this in great detail yet, but it seems like this is another case that would be worth fixing. Looking at the Compiler source, there are still some Class.forNames in there - I suspect that this one is in HostExpr.maybeClass, but I haven't confirmed this yet. Could all class loads in the compiler be replaced with non-initialising loads? I must admit I'm not 100% sure of the semantics of this, I'm assuming that a class that is loaded without initialisation will have its initialiser run if the JVM determines that it's necessary, but I don't know when this would occur. I'm not sure what the potential impact of loading all classes without initialisation would be - any insight would be very welcome. I'm happy to update Aaron's patch if this change seems worth it.

I've just encountered another case of this, which was not due to an import. I'm not sure exactly of the circumstances, but in this case what fixed it was removing a tag from a function defined at the top level with defn:(defn ^HighlightInfo make-highlight [key data]

... etc etc

In this case, the compilation fails with an ExceptionInInitializerError. If I remove the ^HighlightInfo and instead tag the function invocations, everything works fine.

Same here with Counterclockwise. I had to remove some type hintings here, or inside protocol definitions, to get rid of ExceptionInInitializerErrors.