you will also need to hardcode the HDFS URL in the core-site.xml by hand. This is not an issue, since the same config should be deployed on all nodes in the cluster. The same pattern can be applied to mapred-site.xml, though I did not need it.

Here is a little code snippet (actually a ready-to-run JUnit TestCase) which might come in handy if you need a fairly open ThreadPool that is limited not primarily by the number of active threads but by a predicted load factor. The latter can be pretty much anything, such as CPU load or the total number of "items" the whole ThreadPool is allowed to process at a given time.
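A minimal sketch of the idea (not the original TestCase) could look like the following. `LoadTracker`, `currentLoad` and `canHandle(load)` are the names used in this post; `maxLoad`, `tryAcquire`, `release`, `getCurrentLoad` and the `LoadLimitedExecutor` wrapper are assumptions made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

/** Tracks the predicted load currently admitted to the pool. */
class LoadTracker {
    private final int maxLoad;   // hypothetical upper bound for the summed load
    private int currentLoad;

    LoadTracker(int maxLoad) {
        this.maxLoad = maxLoad;
    }

    /** True if a job with the given predicted load would still fit. */
    synchronized boolean canHandle(int load) {
        return currentLoad + load <= maxLoad;
    }

    /** Checks and reserves in one step so two callers cannot both take the last slot. */
    synchronized boolean tryAcquire(int load) {
        if (!canHandle(load)) {
            return false;
        }
        currentLoad += load;
        return true;
    }

    synchronized void release(int load) {
        currentLoad -= load;
    }

    synchronized int getCurrentLoad() {
        return currentLoad;
    }
}

/** A cached (unbounded) thread pool gated by predicted load, not by thread count. */
class LoadLimitedExecutor {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final LoadTracker tracker;

    LoadLimitedExecutor(LoadTracker tracker) {
        this.tracker = tracker;
    }

    /** Runs the job if its predicted load fits, otherwise rejects it. */
    void submit(int predictedLoad, Runnable job) {
        if (!tracker.tryAcquire(predictedLoad)) {
            throw new RejectedExecutionException("load limit reached");
        }
        try {
            pool.execute(() -> {
                try {
                    job.run();
                } finally {
                    tracker.release(predictedLoad); // give the load back, even on failure
                }
            });
        } catch (RuntimeException e) {
            tracker.release(predictedLoad); // the pool itself refused the job
            throw e;
        }
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Rejecting is just one possible policy. Blocking the caller until enough load has been released would work as well, at the price of a wait/notify loop inside the tracker.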

If the predicted load is not dynamic enough for you, you might want to add another monitoring thread that watches some indicators (CPU, RAM, I/O) and adjusts the LoadTracker's currentLoad value accordingly. Another option would be to skip the monitoring thread and extend the canHandle(load) method of the LoadTracker to take the current indicator states into account.

Oh, and please let me know if I am reinventing the wheel, sometimes it is difficult not to.

In retrospect, the same pattern could be applied to the queue beneath the ThreadPool by coupling a LoadTrackableJob with a specific BlockingQueue. I guess you can always make the code / architecture prettier.

So, here it is. Thanks to http://isitdarkoutside.com you will never have to trouble yourself with the question "is it dark outside?" anymore. This super-advanced Android application will let you focus on more important questions such as "which shoe goes on which foot?" from now on. Thank us later.

Just in case you run into an OutOfMemoryError while requesting a large chunk of data from MySQL: the JDBC driver will load ALL (yes, ALL) rows before passing them to your fancy, agile and low-footprint routine. Tweaking the fetchSize property of a statement won't do any good either... well, not without some voodoo. So, here is how you can get the JDBC driver to give you a nice and tight StreamingResultSet:
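A minimal sketch of the voodoo, assuming MySQL Connector/J as the driver: the statement has to be forward-only and read-only, and the fetch size has to be exactly `Integer.MIN_VALUE`, which the driver treats as a signal to stream rows one at a time. The class and method names (`StreamingQuery`, `createStreamingStatement`, `countRows`) are made up for illustration.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class StreamingQuery {

    /** Creates a statement whose result sets Connector/J streams row by row. */
    static Statement createStreamingStatement(Connection connection) throws SQLException {
        Statement statement = connection.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        // The magic value: any other fetch size leaves the driver buffering all rows.
        statement.setFetchSize(Integer.MIN_VALUE);
        return statement;
    }

    /** Example consumer: walks the whole result without holding it in memory. */
    static long countRows(Connection connection, String sql) throws SQLException {
        try (Statement statement = createStreamingStatement(connection);
             ResultSet rows = statement.executeQuery(sql)) {
            long count = 0;
            while (rows.next()) {
                count++;
            }
            return count;
        }
    }
}
```

Keep in mind that while a streaming result set is open, the connection is busy: read all rows or close the result set before issuing another query on the same connection.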

Oh, and you clowns out there saying "this is normal, just bump up the memory settings for your JVM" - are you NUTS!? Or do you just like your applications exploding out of nowhere after being in production for some time?

If you need your Tomcat to write access logs in the combined format while it is being proxied by Apache's mod_proxy, you will soon notice that the client IP address is missing: every request is logged with the proxy's address instead. Pretty useless if you would like to get some access statistics for that particular instance.

Fortunately, Apache's mod_proxy adds some extra headers (such as X-Forwarded-For) with the missing information to each request. Just set up your log configuration in Tomcat as follows:
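A sketch of such a valve for the `<Host>` element in server.xml, assuming the stock AccessLogValve: it is the combined pattern with `%h` swapped for the X-Forwarded-For request header. The `directory`, `prefix` and `suffix` values are placeholders, so adjust them to your setup.

```xml
<!-- Combined log format, but with the client IP taken from the
     X-Forwarded-For header set by mod_proxy instead of %h -->
<Valve className="org.apache.catalina.valves.AccessLogValve"
       directory="logs"
       prefix="localhost_access_log"
       suffix=".txt"
       pattern="%{X-Forwarded-For}i %l %u %t &quot;%r&quot; %s %b &quot;%{Referer}i&quot; &quot;%{User-Agent}i&quot;" />
```

If a request passes through more than one proxy, the header holds a comma-separated list of addresses, with the original client first.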