Clojure JIRA: http://dev.clojure.org/jira
[CLJ-124] GC Issue 120: Determine mechanism for controlling automatic shutdown of Agents, with a default policy and mechanism for changing that policy as needed
http://dev.clojure.org/jira/browse/CLJ-124
<p>The original description when this ticket was vetted is below, starting with "Reported by cemer...@snowtide.com, Jun 01, 2009". This prefix attempts to summarize the issue and discussion.</p>
<p><b>Description</b>:</p>
<p>Several Clojure functions involving agents and futures, such as future, pmap, clojure.java.shell/sh, and a few others, create non-daemon threads in the JVM in an ExecutorService called soloExecutor, which is created via Executors#newCachedThreadPool. The javadocs for this method (<a href="http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newCachedThreadPool%28%29">http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newCachedThreadPool%28%29</a>) say "Threads that have not been used for sixty seconds are terminated and removed from the cache." Because the JVM does not exit while any non-daemon thread is still alive, a Clojure program that has finished its work can wait up to 60 seconds before the JVM process exits. Questions about this confusing behavior come up a couple of times per year on the Clojure Google group. Search for "shutdown-agents" to find most of these occurrences, since calling (shutdown-agents) at the end of one's program typically eliminates this 60-second wait.</p>
<p><b>Example</b>:</p>
<div class="code panel" style="border-width: 1px;"><div class="codeContent panelContent">
<pre class="code-java">% java -cp clojure.jar clojure.main -e <span class="code-quote">"(println 1)"</span>
1
[ <span class="code-keyword">this</span> <span class="code-keyword">case</span> exits quickly ]
% java -cp clojure.jar clojure.main -e <span class="code-quote">"(println @(<span class="code-keyword">future</span> 1))"</span>
1
[ 60-second pause before process exits, at least on many platforms and JVMs ]</pre>
</div></div>
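<p>The mechanism behind this pause can be reproduced in plain Java. The following is a minimal sketch, not code from Clojure itself: the class name CachedPoolExitDemo and its helper methods are hypothetical. It shows that a pool from Executors#newCachedThreadPool uses non-daemon worker threads by default, and that calling shutdown() on the pool lets the idle workers terminate promptly instead of lingering for the keep-alive period.</p>

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it is not part of Clojure.
final class CachedPoolExitDemo {

    /** Runs one task on the pool and reports whether the worker thread was a daemon. */
    static boolean workerIsDaemon(ExecutorService pool) {
        Callable<Boolean> task = () -> Thread.currentThread().isDaemon();
        try {
            return pool.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Shuts the pool down and waits; returns true if every worker terminated in time. */
    static boolean shutdownAndWait(ExecutorService pool, long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newCachedThreadPool();
        System.out.println("worker is daemon: " + workerIsDaemon(pool));
        // Without this call, the idle non-daemon worker keeps the JVM alive
        // until its 60-second keep-alive expires.
        System.out.println("terminated: " + shutdownAndWait(pool, 5));
    }
}
```

<p>Removing the shutdownAndWait call from main should reproduce the delayed exit described above, at least on JVMs that behave as the javadocs describe.</p>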
<p><b>Summary of comments before July 2014</b>:</p>
<p>Most of the comments on this ticket on or before August 23, 2010 were likely spread out in time before being imported from the older ticket tracking system into JIRA. Most of them refer to an older suggested patch that is not in JIRA, and compilation problems it had with JDK 1.5, which is no longer supported by Clojure 1.6.0. I think these comments can be safely ignored now.</p>
<p>Alex Miller blogged about this and related issues here: <a href="http://tech.puredanger.com/2010/06/08/clojure-agent-thread-pools/">http://tech.puredanger.com/2010/06/08/clojure-agent-thread-pools/</a></p>
<p>Since then, two of the suggestions Alex raised have been addressed: one by <a href="http://dev.clojure.org/jira/browse/CLJ-378" title="Set thread names on agent thread pools"><del>CLJ-378</del></a>, and one by the addition of set-agent-send-executor! and similar functions in Clojure 1.5.0: <a href="https://github.com/clojure/clojure/blob/master/changes.md#23-clojurecoreset-agent-send-executor-set-agent-send-off-executor-and-send-via">https://github.com/clojure/clojure/blob/master/changes.md#23-clojurecoreset-agent-send-executor-set-agent-send-off-executor-and-send-via</a></p>
<p>One remaining issue is the topic of this ticket, which is how best to avoid this 60-second pause.</p>
<p><b>Approach #1: automatically shut down agents</b></p>
<p>One method is mentioned in Chas Emerick's original description below. It was suggested by Rich Hickey, though perhaps long enough ago that he may no longer endorse it: create a Var &#42;auto-shutdown-agents&#42;, and when it is true (the default value), call clojure.lang.Agent#shutdown after the clojure.main entry point finishes. This removes the surprising wait for common methods of starting Clojure, while allowing expert users to change that value to false if desired.</p>
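<p>The shape of this approach can be sketched in Java as follows. This is an illustration only and not the contents of any attached patch: the class AutoShutdownSketch, the field autoShutdownAgents (standing in for the proposed Var), and the method runMain (standing in for an entry point such as clojure.main) are all hypothetical names.</p>

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for an entry point; all names here are illustrative.
final class AutoShutdownSketch {

    // Plays the role of the proposed flag: true by default, experts may set it to false.
    static volatile boolean autoShutdownAgents = true;

    /** Runs the program body, then shuts the pool down if the flag is still true. */
    static void runMain(Runnable body, ExecutorService pool) {
        try {
            body.run();
        } finally {
            if (autoShutdownAgents) {
                // Already submitted tasks still run; idle workers then stop,
                // so they no longer delay JVM exit.
                pool.shutdown();
            }
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newCachedThreadPool();
        runMain(() -> pool.submit(() -> System.out.println("work done")), pool);
    }
}
```

<p>The flag is read only after the body returns, so the body itself can turn automatic shutdown off before it finishes.</p>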
<p><b>Approach #2: create daemon threads by default</b></p>
<p>Another method mentioned by several people in the comments is to change the threads created in agent thread pools to daemon threads by default, and perhaps to deprecate shutdown-agents or modify it to be less dangerous. That approach is discussed a bit more in Alex's blog post linked above, and in a comment from Alexander Taggart on July 11, 2011 below.</p>
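<p>A daemon-thread pool of the kind this approach describes might look like the sketch below. It is not the attached patch: the class DaemonPoolSketch, its methods, and the thread-name prefix are hypothetical, and only the standard java.util.concurrent calls are real.</p>

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper; names are illustrative and not taken from Clojure.
final class DaemonPoolSketch {

    /** Returns a factory whose threads are daemons named prefix-0, prefix-1, and so on. */
    static ThreadFactory daemonFactory(String prefix) {
        AtomicLong counter = new AtomicLong(0);
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.getAndIncrement());
            t.setDaemon(true); // daemon threads do not keep the JVM alive
            return t;
        };
    }

    /** A cached pool like the default one, except that its workers are daemon threads. */
    static ExecutorService newDaemonCachedPool(String prefix) {
        return Executors.newCachedThreadPool(daemonFactory(prefix));
    }
}
```

<p>The trade-off is that the JVM may exit while a daemon worker is still in the middle of a task, which is why callers who need the result would block explicitly before the main thread returns.</p>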
<p><b>Approach #3: deprecate shutdown-agents and block explicitly with await and deref</b></p>
<p>The only other comment before 2014 that is not elaborated in this summary is shoover's suggestion: There are already well-defined and intuitive ways to block on agents and futures. Why not deprecate shutdown-agents and force users to call await and deref if they really want to block? In the pmap situation one would have to evaluate the pmap form.</p>
<p><b>Approach #4: Create a cached thread pool with a timeout much lower than 60 seconds</b></p>
<p>This could be done by using one of the ThreadPoolExecutor constructors with a keepAliveTime parameter of the desired time.</p>
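<p>As a sketch of what that constructor call might look like, the following mirrors the settings of a cached thread pool (no core threads, an unbounded maximum, and a SynchronousQueue hand-off) but takes the keep-alive time as a parameter. The class ShortKeepAlivePoolSketch and the method newCachedPool are hypothetical names, and a suitable keep-alive value is left open by the ticket.</p>

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the keep-alive value to use is an open question in the ticket.
final class ShortKeepAlivePoolSketch {

    /** Builds a cached-style pool whose idle workers exit after keepAliveSeconds. */
    static ThreadPoolExecutor newCachedPool(long keepAliveSeconds) {
        return new ThreadPoolExecutor(
                0,                              // no core threads: every worker may time out
                Integer.MAX_VALUE,              // unbounded number of workers
                keepAliveSeconds, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>()); // hand tasks directly to a worker
    }
}
```

<p>With a keep-alive of a few seconds the workers are still non-daemon threads, so the exit delay shrinks to roughly that value rather than disappearing, at the cost of more thread creation when work arrives in bursts.</p>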
<p><b>Patch</b>: clj-124-v1.patch clj-124-daemonthreads-v1.patch</p>
<p>At most one of these patches should be considered, depending upon the desired approach to take.</p>
<p>Patch clj-124-v1.patch implements approach #1 using &#42;auto-shutdown-agents&#42;. See the Jul 31 2014 comment when this patch was added for some additional details.</p>
<p>Patch clj-124-daemonthreads-v1.patch implements approach #2 and is straightforward.</p>
<div class="code panel" style="border-width: 1px;"><div class="codeContent panelContent">
<pre class="code-java">Reported by cemer...@snowtide.com, Jun 01, 2009
There has been intermittent chatter over the past months from a couple of
people on the group (e.g.
http://groups.google.com/group/clojure/browse_thread/thread/409054e3542adc1f)
and in #clojure about some clojure scripts hanging, either <span class="code-keyword">for</span> a constant
time (usually reported as a minute or so with no CPU util) or seemingly
forever (or until someone kills the process).
I just hit a similar situation in our compilation process, which invokes
clojure.lang.Compile from ant. The build process <span class="code-keyword">for</span> <span class="code-keyword">this</span> particular
project had taken 15 seconds or so, but after adding a couple of pmap calls,
that build time jumped to ~1:15, with roughly zero CPU utilization over the
course of that last minute.
Adding a call to Agent.shutdown() in the <span class="code-keyword">finally</span> block in
clojure.lang.Compile/main resolved the problem; a patch including <span class="code-keyword">this</span>
change is attached. I wouldn't suspect anyone would have any issues with
such a change.
-----
In general, it doesn't seem like everyone should keep tripping over <span class="code-keyword">this</span>
problem in different directions. It's a very difficult thing to debug <span class="code-keyword">if</span>
you're not attuned to how clojure's concurrency primitives work under the
hood, and I would bet that newer users would be particularly affected.
After discussion in #clojure, rhickey suggested adding a
*auto-shutdown-agents* <span class="code-keyword">var</span>, which:
- <span class="code-keyword">if</span> <span class="code-keyword">true</span> when exiting one of the main entry points (clojure.main, or the
legacy script/repl entry points), Agent.shutdown() would be called,
allowing <span class="code-keyword">for</span> the clean exit of the application
- would be bound by <span class="code-keyword">default</span> to <span class="code-keyword">true</span>
- could be easily set to <span class="code-keyword">false</span> <span class="code-keyword">for</span> anyone with an advanced use-<span class="code-keyword">case</span> that
requires agents to remain active after the main thread of the application
exits.
This would obviously not help anyone initializing clojure from a different
entry point, but <span class="code-keyword">this</span> may represent the best compromise between
least-surprise and maximal functionality <span class="code-keyword">for</span> advanced users.
------
In addition to the above, it perhaps might be worthwhile to change the
keepalive values used to create the Threadpools used by c.l.Agent's
Executors. Currently, Agent uses a <span class="code-keyword">default</span> thread pool executor, which
results in a 60s keepalive. Lowering <span class="code-keyword">this</span> to something much smaller (1s?
5s?) would additionally minimize the impact of Agent's threadpools on Java
applications that embed clojure directly (and would therefore not benefit
from *auto-shutdown-agents* as currently conceived, leading to puzzling
'hanging' behaviour). I'm not in a position to determine what impact <span class="code-keyword">this</span>
would have on performance due to thread churn, but it would at least
minimize what would be perceived as undesirable behaviour by users that are
less familiar with the implementation details of Agent and code that
depends on it.
Comment 1 by cemer...@snowtide.com, Jun 01, 2009
Just FYI, I'd be happy to provide patches <span class="code-keyword">for</span> either of the suggestions mentioned
above...</pre>
</div></div>
<p>CLJ-124 | Enhancement | Minor | Open | Unresolved | Alex Miller | Chas Emerick | agents | Wed, 17 Jun 2009 23:24:00 -0500 | Sat, 23 Aug 2014 11:52:33 -0500 | Release 1.5 | Release 1.6 | 86</p>
<p>Converted from <a href="http://www.assembla.com/spaces/clojure/tickets/124">http://www.assembla.com/spaces/clojure/tickets/124</a><br/>
Attachments:<br/>
compile-agent-shutdown.patch - <a href="https://www.assembla.com/spaces/clojure/documents/a56S2ow4ur3O2PeJe5afGb/download/a56S2ow4ur3O2PeJe5afGb">https://www.assembla.com/spaces/clojure/documents/a56S2ow4ur3O2PeJe5afGb/download/a56S2ow4ur3O2PeJe5afGb</a><br/>
124-compilation.diff - <a href="https://www.assembla.com/spaces/clojure/documents/aqn0IGxZSr3RUGeJe5aVNr/download/aqn0IGxZSr3RUGeJe5aVNr">https://www.assembla.com/spaces/clojure/documents/aqn0IGxZSr3RUGeJe5aVNr/download/aqn0IGxZSr3RUGeJe5aVNr</a></p><p>oranenj said: [<a href="file:a56S2ow4ur3O2PeJe5afGb">file:a56S2ow4ur3O2PeJe5afGb</a>]</p><p>richhickey said: Updating tickets (#8, #19, #30, #31, #126, #17, #42, #47, #50, #61, #64, #69, #71, #77, #79, #84, #87, #89, #96, #99, #103, #107, #112, #113, #114, #115, #118, #119, #121, #122, #124)</p><p>cemerick said: (In [<span class="error">&#91;r:fa3d24973fc415b35ae6ec8d84b61ace76bd4133&#93;</span>]) Add a call to Agent.shutdown() at the end of clojure.lang.Compile/main Refs #124</p>
<p>Signed-off-by: Chouser &lt;chouser@n01se.net&gt;</p>
<p>Branch: master</p><p>chouser@n01se.net said: I'm closing this ticket because the attached patch solves a specific problem. I agree that the idea of an &#42;auto-shutdown-agents&#42; var sounds like a positive compromise. If Rich wants a ticket to track that issue, I think it'd be best to open a new ticket (and perhaps mention this one there) rather than use this ticket to track further changes.</p><p>scgilardi said: With both Java 5 and Java 6 on Mac OS X 10.5 Leopard I'm getting an error when compiling with this change present.</p>
<p>Java 1.5.0_19<br/>
Java 1.6.0_13</p>
<p>For example, when building clojure using "ant" from within my clone of the clojure repo:</p>
<p><span class="error">&#91;java&#93;</span> java.security.AccessControlException: access denied (java.lang.RuntimePermission modifyThread)<br/>
<span class="error">&#91;java&#93;</span> at java.security.AccessControlContext.checkPermission(AccessControlContext.java:264)<br/>
<span class="error">&#91;java&#93;</span> at java.security.AccessController.checkPermission(AccessController.java:427)<br/>
<span class="error">&#91;java&#93;</span> at java.util.concurrent.ThreadPoolExecutor.shutdown(ThreadPoolExecutor.java:894)<br/>
<span class="error">&#91;java&#93;</span> at clojure.lang.Agent.shutdown(Agent.java:34)<br/>
<span class="error">&#91;java&#93;</span> at clojure.lang.Compile.main(Compile.java:71)</p>
<p>I reproduced this on two Mac OS X 10.5 machines. I'm not aware of having any enhanced security policies along these lines on my machines. The compile goes fine for me with Java 1.6.0_0 on an Ubuntu box.</p><p>chouser@n01se.net said: I had only tested it on my ubuntu box &#8211; looks like that was openjdk 1.6.0_0. I'll test again with sun-java5 and sun-java6.</p><p>chouser@n01se.net said: 1.6.0_13 worked fine for me on ubuntu, but 1.5.0_18 generated the exception Steve pasted. Any suggestions? Should this patch be backed out until someone has a fix?</p><p>achimpassen said: [<a href="file:aqn0IGxZSr3RUGeJe5aVNr">file:aqn0IGxZSr3RUGeJe5aVNr</a>]</p><p>chouser@n01se.net said: With Achim's patch, clojure compiles for me on ubuntu using java 1.5.0_18 from sun, and still works on 1.6.0_13 sun and 1.6.0_0 openjdk. I don't know anything about ant or the security error, but this is looking good to me.</p><p>achimpassen said: It works for me on 1.6.0_13 and 1.5.0_19 (32 and 64 bit) on OS X 10.5.7.</p><p>chouser@n01se.net said: (In [<span class="error">&#91;r:895b39dabc17b3fd766fdbac3b0757edb0d4b60d&#93;</span>]) Rev fa3d2497 causes compile to fail on some VMs &#8211; back it out. Refs #124</p>
<p>Branch: master</p><p>mikehinchey said: I got the same compile error on both 1.5.0_11 and 1.6.0_14 on Windows. Achim's patch fixes both.</p>
<p>See the note for "permissions" on <a href="http://ant.apache.org/manual/CoreTasks/java.html">http://ant.apache.org/manual/CoreTasks/java.html</a>. I assume ThreadPoolExecutor.shutdown is the problem: it would shut down the main Ant thread, so Ant disallows that. Forking avoids the permissions limitation.</p>
<p>In addition, since the build error still resulted in "BUILD SUCCESSFUL", I think failonerror="true" should also be added to the java call so the build would totally fail for such an error.</p><p>chouser@n01se.net said: I don't know if the &lt;java fork=true&gt; patch is a good idea or not, or if there's a better way to solve the original problem.</p>
<p>Chas, I'm kicking back to you, but I guess if you don't want it you can reassign to "nobody".</p><p>richhickey said: Updating tickets (#8, #42, #113, #2, #20, #94, #96, #104, #119, #124, #127, #149, #162)</p><p>shoover said: I'd like to suggest an alternate approach. There are already well-defined and intuitive ways to block on agents and futures. Why not deprecate shutdown-agents and force users to call await and deref if they really want to block? In the pmap situation one would have to evaluate the pmap form.</p>
<p>The System.exit problem goes away if you configure the threadpools to use daemon threads (call new ThreadPoolExecutor and pass a thread factory that creates threads and sets daemon to true). That way the user has an explicit means of blocking and System.exit won't hang.</p><p>alexdmiller said: I blogged about these issues at:<br/>
<a href="http://tech.puredanger.com/2010/06/08/clojure-agent-thread-pools/">http://tech.puredanger.com/2010/06/08/clojure-agent-thread-pools/</a></p>
<p>I think that:</p>
<ul>
<li>agent thread pool threads should be named (see ticket <a href="https://www.assembla.com/spaces/clojure/tickets/378-set-thread-names-on-agent-thread-pools">#378</a>)</li>
<li>agent thread pools must be daemon threads by default</li>
<li>having ways to specify a customized executor pool for an agent send/send-off is essential to customize threading behavior</li>
<li>(shutdown-agents) should be either deprecated or made less dangerous</li>
</ul>
<p>Rich, what is the intention behind using non-daemon threads in the agent pools?</p>
<p>If it is because daemon threads could terminate before their work is complete, would it be acceptable to <a href="http://download.oracle.com/javase/1.5.0/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)">add a shutdown hook</a> to ensure against such premature termination? Such a shutdown hook could call <tt>Agent.shutdown()</tt>, then <a href="http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/ExecutorService.html#awaitTermination(long, java.util.concurrent.TimeUnit)"><tt>awaitTermination()</tt></a> on the pools.</p><p>Moving this ticket out of approval "OK" status, and dropping the priority. These were Assembla import defaults.</p>
<p>Also, Chas gets to be the Reporter now.</p><p>Heh, blast from the past.</p>
<p>The comment import appears to have set their timestamps to the date of the import, so the conversation is pretty hard to follow, and obviously doesn't benefit from the intervening years of experience. In addition, there have been plenty of changes to agents, including some <a href="https://github.com/clojure/clojure/commit/f5f4faf">recent enhancements</a> that address some of the pain points that Alex Miller mentioned above.</p>
<p>I propose closing this as 'invalid' or whatever, and opening one or more new issues to track whatever issues still persist (presumably based on fresh ML discussion, etc).</p><p>Rereading the original description of this ticket, without reading all of the comments that follow, that description is still right on target for the behavior of latest Clojure master today.</p>
<p>People send messages to the Clojure Google group every couple of months hitting this issue, and one even filed <a href="http://dev.clojure.org/jira/browse/CLJ-959" title="after call to clojure.java.shell/sh, jvm won&#39;t exit"><del>CLJ-959</del></a> because of hitting it. I have updated the examples on ClojureDocs.org for future, and also for pmap and clojure.java.shell/sh which use future in their implementations, to warn people about this and explain that they should call (shutdown-agents), but making it unnecessary to call shutdown-agents would be even better, at least as the default behavior. It sounds fine to me to provide a way for experts on thread behavior to change that default behavior if they need to.</p><p>Patch clj-124-v1.patch dated Jul 31 2014 implements the approach of calling clojure.lang.Agent#shutdown when the new Var &#42;auto-shutdown-agents&#42; is true, which is its default value.</p>
<p>I don't see any benefit to making this Var dynamic. Unless I am missing something, only the root binding value is visible after clojure.main/main returns, not any binding that would be pushed on top of that if it were dynamic. It seems that alter-var-root is required to change it to false in a way that makes this patch skip the call to clojure.lang.Agent#shutdown.</p>
<p>This patch only adds the shutdown call to clojure.main#main, but the call can easily be added to the legacy_repl and legacy_script methods if desired.</p><p>Patch clj-124-daemonthreads-v1.patch dated Aug 23 2014 simply modifies the ThreadFactory so that every thread created in an agent thread pool is a daemon thread.</p>
<p>Approval: Vetted | Global Rank | Patch: Code | Waiting On: richhickey</p>