In my previous entry on AtomicInteger I showed timings on my single CPU ThinkPad which has a 1.7 GHz Pentium M. Today I investigated the performance on my iMac which has a 1.83 GHz Core Duo and like the ThinkPad runs Windows XP SP2.

I've also modified AtomicInteger to take advantage of Interlocked.Increment, Decrement and Exchange. Instead of following the reference implementation which builds everything on top of the compareAndSet operation.

Running the same test produces the following numbers:

Time (ms)

JDK 1.5 HotSpot Server VM

2282

JDK 1.5 HotSpot Client VM

2922

IKVM

2406

C#

1922

Note that the test is significantly slower than on the Pentium M. This is most likely because of the bus locking overhead.

It gets even more interesting when we modify the test to have two threads that concurrently increment the same field:

Time (ms)

JDK 1.5 HotSpot Server VM

22563

JDK 1.5 HotSpot Client VM

25109

IKVM

15016

C#

12000

The first thing to note is the fact that the test now takes an order of magnitude more time. Presumably, this is caused by the communication overhead between the two cores.

The second is that here IKVM significantly outperforms HotSpot Server. The reason for this is simple: IKVM uses Interlocked.Increment which ultimately uses the XADD x86 instruction that atomically performs the load/increment/store. By contrast, the AtomicInteger reference implementation uses the following loop to increment the value:

When another thread updates the field between the get and compareAndSet, the compareAndSet will fail and the loop will have to run for another iteration. When multiple CPUs are continuously trying to increment the value this is likely to happen relatively frequently.

Standard disclaimer: As always these microbenchmarks are designed to magnify a particular effect. Be careful about drawing overly large conclusions from them.

A new 0.28 release to fix two bugs reported by users. The first bug is a regression that caused native method resolution not to work for statically compiled code and the second bug is that instanceof didn't work with ghost interface arrays (e.g. obj instanceof Serializable[] would cause the runtime to abort). No other changes.

Last week Tom Tromey posted a proposal for a patch to the Classpath patches mailing list for JSR-166 support. Triggered by this I did some preliminary investigation to see how much work it will be to support this in IKVM.

The reference implementation uses the sun.misc.Unsafe class to calculate the offset of a field in a class and to directly access this field using the appropriate memory model semantics. HotSpot has intrinsic support for the performance cricitical methods in sun.misc.Unsafe. Unfortunately, this model isn't very suitable for IKVM, because the CLI does not specify a way to portably obtain the offset of a field in a class (and it even allows the offset to change over time). I've implemented all the required methods in sun.misc.Unsafe based on reflection, but obviously this is very slow (preliminary measurements show a 10x to 100x slowdown compared to JDK 1.5).

The good news is that some important classes can easily be ported to a more CLI friendly approach. As an example I've modified the reference implementation of AtomicInteger to use System.Threading.Interlocked.CompareExchange() instead of sun.misc.Unsafe.compareAndSwapInt(). This is a small change and allows for good performance. Here is the benchmark I used:

A long standing feature request for IKVM has been to make Java collections implement System.IEnumerable and I've always resisted doing that for various reasons.

I while ago -- while discussing this with Stuart Ballard -- I realised that with the proposed C# 3.0 language extensions, in particular "extension methods" it should be possible to make Java collections foreach-able with little effort:

For those who don't know how extension methods work, basically the above code (note the this keyword before the argument type) adds a method to java.util.Collection that feels like an instance method.

Given the way foreach works, I had expected that it too would recognize this GetEnumerator method and therefore work on all Java collections, but alas that is not the case.If you want to see this changed before C# 3.0 is released, please vote. Alternatively, if you're Miguel and on the ECMA C# committee, it should be obvious what to do

A new release candidate based on GNU Classpath 0.91. I did some perf work and a fair bit of restructuring in the way class loaders work, to better support ikvmc and ikvmstub on assemblies loaded in the ReflectionOnly context (on .NET 2.0). The perf work resulted in a significantly improved startup time for Eclipse (IKVM 0.26: 1 minute 23 seconds, IKVM 0.28: 55 seconds).

Updated japi results are available here. The comparison is against JDK 1.5 now and ignores the generics metadata.

Implemented socket timeout by using Socket.Poll() instead of setting the timeout on the underlying socket (because .NET 2.0 considers the socket broken after a Read timed out).

Removed classLoader field from TypeWrapper and made GetClassLoader() abstract (to enable subclasses to use a more efficient way to store the class loader).

Moved creation of java.lang.Class object from TypeWrapper constructor (because it would trigger a call to GetClassLoader before that was ready to return meaningful data and also because it wasn't semantically correct to expose the TypeWrapper instance before it was fully constructed).

Added ikvm.lang.Internal annotation to mark types and members as internal to the assembly (supported by ikvmc only, not in dynamic mode.) Note that ikvm.lang.Internal should only be applied to public types and members. Note also that to reflection these internal types and members will appears as package private.

Removed map.xml hacks to work around cross package accessibility issues (now done by marking the appropriate types/members public and annotating them with @ikvm.lang.Internal).

Hopefully someday Java will support more than the current simplistic scheme where classes are either public or non-public. In the meantime, I've added support to ikvmc for marking classes (and members) as internal (i.e. private to the assembly they live in). By marking a public class with the @ikvm.lang.Internal annotation ikvmc (and IKVM.Runtime.dll) will only allow access to the class by other code also living in the same assembly. I've also added a switch to ikvmc for making an entire package tree internal (the switch is named -privatepackage:<prefix>).

In addition, I've now enabled 1.5 support in the mainline version. Many of the 1.5 APIs that previously were only available on the GNU Classpath generics branch have been ported to the trunk, so it is starting to make more sense to enable 1.5 support in the VM by default.

The code is in cvs, although SourceForge has been having some problems, so it might not yet be available thru anoncvs. Development snapshot binaries are available here.

Yesterday I thought of a potential micro optimization for the generated CIL code. Since I'm now tracking method invocations on the this reference, it is possible to devirtualize calls to final methods or methods that are provably not overridden. I didn't expect too much from this optimization, because I assumed that the .NET JIT is capable of doing the same optimization and so the only effective difference would be the removal of the explict null check of the this reference.

Somewhat to my surprise neither JDK 1.5 Client VM nor the .NET Framework JIT realised that the virtual method invocation could be devirtualized. Unsurprisingly, HotSpot Server VM was able to deduce that the loop didn't actually do anything at all and optimize it away entirely.

I also threw in JDK 1.1, because as usual in microbenchmarks it outperforms HotSpot Client VM.

Note that the use of the equals method in the benchmark is not very significant, it's just a placeholder for a simple method that can easily be inlined. This optimization is most significant for simple property getters that can be inlined.

The code for this optimization is not yet in cvs.

Oops

On a more embarrassing note, doing this optimization also revealed a hole in the object model remapping. Throwable doesn't properly inherit all the Object methods. The code that gets generated works correctly (in most cases), but it is not verifiable. Here's an example of something that is currently broken: