Tuesday, April 22, 2008

The factorial of a number is the product of all positive integers less than or equal to it. For example, the factorial of 5 is 5 x 4 x 3 x 2 x 1 = 120.

When conducting a Java interview, I like to ask my candidates to write an algorithm to calculate the factorial of a number. This is one of my favourite questions because it gives me an insight into their thinking process. It is also one of the questions that I was asked at Merrill Lynch, in my first interview out of university. (I got it right.)

I am surprised by how many people cannot answer this question, or even come close. C'mon, you can do it using recursion or loop iteration; it's not that difficult! If you can't count backwards, do it forwards: it's the same thing.
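For the record, here is a minimal sketch of both answers I am happy to accept, one iterative and one recursive (the class and method names are just mine for illustration):

```java
class Factorial {

    // Iterative version: multiply 1 * 2 * ... * n, counting forwards.
    static long factorialIterative(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    // Recursive version: n! = n * (n-1)!, with 0! = 1 as the base case.
    static long factorialRecursive(int n) {
        return n <= 1 ? 1 : n * factorialRecursive(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(factorialIterative(5)); // prints 120
    }
}
```

Either direction of counting is fine; what I am really looking for is a correct base case and termination.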

Saturday, April 19, 2008

A few weeks ago, I had the opportunity of attending an interesting presentation by Sun's Brian Goetz on the new concurrency features in JDK 7, which will allow applications to take advantage of idle CPU cores. Here is a summary of how this works:

In the recent past we have stopped seeing increases in CPU clock rates. We have started getting more cores per chip rather than faster cores. This brings new challenges in programming. We need to learn the art of decomposing our applications into units of work that can be executed in parallel over multiple cores.

Enter the fork-join framework - a library to provide better support for fine-grained parallelism. The most promising candidates for parallelism are sorting and searching. Sorting can be parallelised with merge sort. Searching can be parallelised by searching sub-regions of the data in parallel and then merging the results. This is an example of divide-and-conquer, in which a problem is broken down into sub-problems which are solved and the results combined. In merge sort, we divide the data set into pieces, sort the pieces and merge the results. The result is still O(n log n), but the sub-problems can be solved in parallel. We can apply this process recursively, until the sub-problems are small enough that a sequential solution would be faster.

The INVOKE-IN-PARALLEL step creates two or more new tasks (fork) and suspends the current task until the new tasks complete (join). Naive implementations would create a new thread for each task and then invoke Thread.join() for the join operation. But thread creation is expensive and requires O(log n) idle threads. A non-naive implementation is available in package java.util.concurrent.forkjoin in JDK 7 or you can download jsr166y.
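To make this concrete, here is a sketch of a divide-and-conquer task using the fork-join API as it eventually shipped in JDK 7 (`java.util.concurrent.ForkJoinPool` and `RecursiveTask`; note the package name differs slightly from the preview package mentioned above). It sums a long array, forking below a threshold size that I have picked arbitrarily:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ParallelSum extends RecursiveTask<Long> {

    private static final int THRESHOLD = 1000; // below this, just sum sequentially

    private final long[] data;
    private final int lo, hi;

    ParallelSum(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {
            // Small enough: solve sequentially.
            long sum = 0;
            for (int i = lo; i < hi; i++) {
                sum += data[i];
            }
            return sum;
        }
        // INVOKE-IN-PARALLEL: fork one half, compute the other in this thread.
        int mid = (lo + hi) >>> 1;
        ParallelSum left = new ParallelSum(data, lo, mid);
        ParallelSum right = new ParallelSum(data, mid, hi);
        left.fork();                        // schedule left half asynchronously
        long rightResult = right.compute(); // do right half ourselves
        return left.join() + rightResult;   // join = wait for the forked task
    }

    static long sum(long[] data) {
        return new ForkJoinPool().invoke(new ParallelSum(data, 0, data.length));
    }
}
```

Note that the current task computes one of the sub-problems itself rather than forking both; this keeps the worker busy instead of sleeping, which is exactly the behaviour the work-stealing scheduler below relies on.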

The fork-join framework is implemented using work stealing. You create a limited number of worker threads (the best choice is Runtime.getRuntime().availableProcessors()). Each worker maintains a double-ended work queue (deque). When forking, a worker pushes the new task onto the head of its deque. When waiting or idle, a worker pops a task off the head of its deque and executes it instead of sleeping. If its deque is empty, it steals a task off the tail of another randomly chosen worker's deque. This means that there is never any contention for the head and almost never contention for the tail. Stealing is infrequent: because each worker works in LIFO order, a task stolen from the tail is generally a big chunk of work, which keeps the thief from having to steal again for a while.

When pool.invoke() is called, the task is placed on a random deque. That worker executes the task, which usually just pushes two more tasks onto its deque. It then starts on one of the subtasks. Soon, some other worker steals the other top-level subtask. Pretty soon, most of the forking is done and the tasks are distributed among the various work queues. Now the workers start on the sequential subtasks. If work is unequally distributed, it is corrected via stealing. The result is reasonable load balancing, with no central coordination, little scheduling overhead and minimal synchronisation costs.

We can reduce the amount of boilerplate code in fork-join tasks by using the ParallelArray classes, which let you declaratively specify aggregate operations on data arrays. ParallelArrays use fork-join to execute efficiently on the available hardware.

Friday, April 18, 2008

The purpose of Java's Garbage Collector is to find and reclaim unreachable objects. This post will show you how to write readable, clean code that makes the most of the garbage collector (in terms of throughput, responsiveness, etc.).

Do not be afraid to allocate small objects for intermediate results. Typically, object allocation is very cheap and reclamation of new objects is very cheap too! GCs love small, immutable objects and generational GCs love small, short-lived objects. You should use short-lived immutable objects instead of long-lived mutable ones. Use clearer, simpler code with more allocations instead of more obscure code with fewer allocations.
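As an illustration of this advice, here is a small immutable value class of the kind I mean (a hypothetical Money type, not from any library); each operation allocates a fresh short-lived object, which is exactly what a generational GC handles cheaply:

```java
// A small, immutable value object. Each arithmetic operation returns a
// new instance rather than mutating this one; the intermediates are
// short-lived and cheap for a generational GC to reclaim.
final class Money {

    private final long pence;

    Money(long pence) {
        this.pence = pence;
    }

    Money add(Money other) {
        return new Money(this.pence + other.pence);
    }

    long pence() {
        return pence;
    }
}
```

The chain `a.add(b).add(c)` creates intermediate Money objects, but that is fine: the code is clearer than accumulating into a mutable field, and the intermediates die young.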

Avoid large objects as they are expensive to allocate and initialise.

Nulling references rarely helps the GC. Instead, it creates clutter and, in the worst case, may introduce a bug. The JIT can do liveness analysis very well.

Avoid explicit GCs. Applications do not have enough information to decide when to collect, but the GC does. Calling System.gc() at the wrong time may start a full stop-the-world collection (in HotSpot), which can hurt performance with no benefit.

Avoid frequent resizing of array-based data structures as this will allocate several large-ish arrays and will cause a lot of array copying. Try to size them, on initialisation, as realistically as possible.
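For example, an ArrayList or HashMap can be given its expected size up front via the constructor, avoiding the repeated grow-and-copy cycle (the method below is just an illustrative sketch):

```java
import java.util.ArrayList;
import java.util.List;

class Presize {

    // Sized on construction: the backing array is allocated once, so
    // adding 'expected' elements causes no intermediate array copies.
    static List<Integer> load(int expected) {
        List<Integer> values = new ArrayList<Integer>(expected);
        for (int i = 0; i < expected; i++) {
            values.add(i);
        }
        return values;
    }
}
```

With the default constructor instead, the backing array would be reallocated and copied several times as the list grows past each capacity boundary.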

Only use object pools for objects that are expensive to allocate/initialise or for those that represent scarce resources e.g. database connection pools or thread pools. Use existing libraries wherever possible. Unused objects in pools are bad: they are live, so the GC must process them, yet they provide no benefit to the application.

Use finalization as a last resort. Finalizable object allocation is much slower and reclamation takes at least two GC cycles: the first cycle identifies the object as garbage and enqueues it on the finalization queue; the second reclaims its space after finalize() completes. Finalizers are not like C++ destructors! There is no guarantee if or when they will be called.
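The usual alternative is an explicit close() method invoked in a finally block, so cleanup is deterministic and no finalizer is needed. A minimal sketch (the Resource class here is hypothetical):

```java
class ExplicitClose {

    // A hypothetical resource with a deterministic close() method.
    static class Resource {
        boolean open = true;

        void close() {
            open = false;
        }
    }

    static boolean useResource() {
        Resource r = new Resource();
        try {
            // ... work with the resource while it is open ...
            return r.open;
        } finally {
            r.close(); // runs even if the try block throws
        }
    }
}
```

Unlike finalize(), the finally block runs at a known point, exactly once, on the calling thread.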

Memory leaks occur because objects are reachable but unused. This may be due to objects in the wrong scope, lapsed listeners, exceptions changing control flow, instances of inner classes, reference objects, etc. You should perform reachability analysis to track down and clear up such objects.
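The lapsed-listener case in particular can be mitigated by holding listeners through WeakReference, so that registration alone does not keep a listener reachable once the rest of the application has forgotten about it. A sketch, with hypothetical names:

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// A listener registry that holds its listeners weakly: if the caller
// drops its own reference to a listener, the GC may reclaim it even
// though it was never explicitly unregistered.
class ListenerRegistry {

    private final List<WeakReference<Runnable>> listeners =
            new ArrayList<WeakReference<Runnable>>();

    void register(Runnable listener) {
        listeners.add(new WeakReference<Runnable>(listener));
    }

    void fire() {
        for (WeakReference<Runnable> ref : listeners) {
            Runnable listener = ref.get(); // null if already collected
            if (listener != null) {
                listener.run();
            }
        }
    }
}
```

The trade-off is that listeners which are only reachable through the registry will silently disappear, so callers must keep their own strong reference for as long as they want notifications.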


Hi, I'm Fahd, a software developer at an investment bank in London. I am passionate about technology and work mainly with open source software, specialising in Java applications and Unix-based operating systems.

This blog is a place for me to share useful code snippets to solve problems that I have come across, and to write about ideas and experiences as a programmer.

All code on this blog has been written by me, unless stated otherwise, and you are free to use, share and adapt it for any purpose, under the terms of the GNU General Public License.

I love hearing back from my readers, so please feel free to leave comments! Thanks for reading and happy programming :-)