While it's been quiet on the blog and the mailing list, there has been quite a lot of progress behind the scenes.

The next Mono release will contain the C half of the IKVM JNI provider and the next IKVM snapshot will contain the C# half of the Mono JNI provider. This means that JNI will work out of the box on Mono (for the parts of JNI that are actually implemented). Thanks to Zoltan and Miguel for this.

I'm planning an IKVM 0.8 release to coincide with the Mono 1.0 release.

John Luke added IKVM support to MonoDevelop. Read about it here or see the screenshot here.

I successfully started up Eclipse 3.0 M8 for the first time yesterday. Thanks to Michael Koch for his work on GNU Classpath's java.nio implementation and all the other GNU Classpath hackers, of course.

In the comments, Jesus Garcia pointed me to SwingWT, an SWT-based implementation of AWT and Swing by Robin Rawson-Tetley. Very cool stuff! It doesn't run on the latest snapshot due to a JNI bug, but I have it running and it's very cool to see the SwingSet demo running on IKVM.

I hope to do a new snapshot in the first week of May and after that to work towards the 0.8 release.

The last snapshot was totally broken on Mono (*blush*). My apologies to the Mono users. I should test my snapshots on Mono before releasing them, but I'm lazy so this sometimes slips through the cracks.

The breakage was kind of interesting though. As I wrote last time, I rewrote the bytecode compiler to emit CIL in the same order as the Java bytecode. This caused invalid (per the ECMA spec) CIL to be generated in some cases. The interesting thing is that the code isn't really invalid, the only reason it is invalid is because the spec says so:

Partition III -- 1.7.5 Backward Branch Constraints

It must be possible, with a single forward-pass through the CIL instruction stream for any method, to infer the exact state of the evaluation stack at every instruction (where by “state” we mean the number and type of each item on the evaluation stack).

In particular, if that single-pass analysis arrives at an instruction, call it location X, that immediately follows an unconditional branch, and where X is not the target of an earlier branch instruction, then the state of the evaluation stack at X, clearly, cannot be derived from existing information. In this case, the CLI demands that the evaluation stack at X be empty.

(Note that the section numbering seems to change with each version, this is from the working draft of June 2003.)

Java bytecode has no such requirement, so my straightforward translation caused Java bytecode that is only reachable through a backward branch to be translated into invalid CIL.
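To make the constraint a bit more tangible, here is a minimal sketch of a single forward pass that tracks only the stack depth (the real rule also covers the type of each stack item). The instruction set, the class names Insn and SinglePassStackChecker, and the check method are all invented for this illustration; this is not the IKVM compiler or the Mono JIT.

```java
import java.util.Arrays;

// Hypothetical mini instruction set, invented for this sketch (not real CIL).
final class Insn {
    final int stackDelta;        // net change in evaluation stack depth
    final int branchTarget;      // index of the branch target, or -1 if not a branch
    final boolean unconditional; // true for an unconditional branch

    private Insn(int stackDelta, int branchTarget, boolean unconditional) {
        this.stackDelta = stackDelta;
        this.branchTarget = branchTarget;
        this.unconditional = unconditional;
    }

    static Insn push() { return new Insn(+1, -1, false); }
    static Insn pop() { return new Insn(-1, -1, false); }
    static Insn br(int target) { return new Insn(0, target, true); }
}

final class SinglePassStackChecker {
    private static final int UNKNOWN = -1;

    // Returns true if the stack depth at every instruction can be inferred
    // in one forward pass, assuming all branch targets are valid indexes.
    static boolean check(Insn[] code) {
        int[] depthAt = new int[code.length];
        Arrays.fill(depthAt, UNKNOWN);
        int depth = 0;
        boolean fallThrough = true; // false right after an unconditional branch
        for (int i = 0; i < code.length; i++) {
            if (!fallThrough) {
                // Location X: use the depth recorded by an earlier branch to X,
                // otherwise the stack is required to be empty here.
                depth = depthAt[i] != UNKNOWN ? depthAt[i] : 0;
            } else if (depthAt[i] != UNKNOWN && depthAt[i] != depth) {
                return false; // fall-through and earlier branch disagree
            }
            depthAt[i] = depth;
            Insn insn = code[i];
            depth += insn.stackDelta;
            if (depth < 0) {
                return false; // pop from a stack that the pass believes is empty
            }
            if (insn.branchTarget >= 0) {
                int target = insn.branchTarget;
                if (depthAt[target] == UNKNOWN) {
                    depthAt[target] = depth; // forward branch: record expectation
                } else if (depthAt[target] != depth) {
                    return false; // branch disagrees with the already known depth
                }
            }
            fallThrough = !insn.unconditional;
        }
        return true;
    }
}
```

In the sequence push, br 3, pop, br 2 the pop at index 2 is only reachable through the backward branch at index 3, so the forward pass has to assume an empty stack there and rejects the code, even though a full data flow analysis would accept it.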

The Microsoft verifier and JIT don't require this constraint to be met and they will happily verify and JIT code that violates this constraint. However, the Mono JIT relies on it and so it was unable to handle some of the CIL that IKVM generated.

I fixed the IKVM bytecode compiler to emit ECMA compliant code (at least for this particular issue, who knows what else is wrong). I also fixed exception mapping, which didn't work on Mono as it relied on a currently unimplemented feature (I think, I didn't really investigate).

I think that, realistically, the Mono JIT will also have to be "fixed" to support the broken code, as you can be sure that there will be compilers that emit broken code, because it works on the Microsoft runtime.

I finished and fixed the local variable analysis and made various improvements to the debugging experience (in Visual Studio .NET). In the process I discovered some quirks in jikes' debugging tables. It associates two line numbers with the same bytecode address when that bytecode address is also the start of an exception block. It also (I think incorrectly) starts a local variable scope before the store instruction that first initializes that local variable. Javac starts the local variable scope immediately after the first store to that local. This makes more sense to me, but it also required a hack in the local variable debugging support to make sure that the IL variable scope starts at the right position (i.e. before the first store to that local variable).

To make local variable scopes work, I had to change the bytecode compiler to emit IL in the same order as the original Java bytecode. For some reason I originally wrote the compiler to do the same code flow analysis as the verifier and this resulted in a somewhat arbitrary order of the generated IL. I finally fixed that and I also removed the recursion from the compiler (it now uses an explicit stack to keep track of exception blocks).

Note that unreachable code is still not compiled (it can't be verified, so it can't be compiled), but there will be nop IL instructions for each unreachable bytecode instruction, so you can set breakpoints and move the instruction pointer there. This could be a bit confusing and I probably should fix it at some point (by not emitting the nops).

I also had to reintroduce the automatic downcasting of locals, because I realised that it is possible to write bytecode by hand that depends on this (Java source will never require it). In non-debug builds of classpath.dll this introduces 12 (unnecessary) downcasts, so that's not really worth optimising.

What's new?

When compiling with the -debug option, dead stores are no longer optimized out.

When compiling with -debug option, local variables are merged based on LocalVariableTable information.

When compiling with -debug option, the first instruction of a method will always (except when compiling with -Xmethodtrace) have an associated line number now (to enable stepping into the method).

When compiling with -debug option, local variables are now scoped based on LocalVariableTable information. Different locals with the same name are now handled correctly in the debugger.

Made helper method MethodInfo caching more consistent.

Method arguments that are value types are now boxed on method entry and not on each individual load; this makes them behave consistently with local variables.

Reusing method argument local variable slots (with different types) is now fully supported. I've never seen a Java compiler do this, but it is legal.

lookupswitch/tableswitch branches into exception blocks are now handled correctly (exception block is split around branch target). This completes branch/exception block handling. I originally thought that this would never happen, but apparently it does.

Added a helper method to emit Ldc_I4 that always uses the optimal encoding.

Added a class loader check to the System.arraycopy optimization check, to make sure that we're in fact calling System.arraycopy on the bootstrap version of java.lang.System.

Many fixes to the local variable analysis.

Added -srcpath option to ikvmc. If specified, this prepends the specified path and the package name to the source file name in the debugging information. In most cases this will be the correct location for the source file and will allow the debugger to automatically load the correct source.
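As an illustration of the Ldc_I4 helper mentioned in the list above, here is a small sketch of how the shortest encoding can be chosen. The class and method names are made up and it returns mnemonics instead of emitting opcodes, so it is not the actual IKVM helper; the instruction forms themselves (ldc.i4.m1, ldc.i4.0 through ldc.i4.8, ldc.i4.s and ldc.i4) are the standard CIL ones.

```java
final class LdcI4Sketch {
    // Returns the mnemonic (plus operand, if any) of the shortest CIL
    // instruction that pushes the given 32-bit integer constant.
    static String shortestForm(int value) {
        if (value == -1) {
            return "ldc.i4.m1";          // 1 byte, no operand
        }
        if (value >= 0 && value <= 8) {
            return "ldc.i4." + value;    // 1 byte, no operand
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "ldc.i4.s " + value;  // 2 bytes, signed 8-bit operand
        }
        return "ldc.i4 " + value;        // 5 bytes, 32-bit operand
    }
}
```

For example, shortestForm(9) yields "ldc.i4.s 9" while shortestForm(128) has to fall back to the full "ldc.i4 128".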

Given an instance of System.Type, how do I determine what the typename is in Java?

This is tricky and my (lame) answer is that you shouldn't have to. Do you have a specific scenario in mind? Having said that, in your comment you're on the right track, there is no one-to-one mapping between System.Type instances and java.lang.Class instances. What I think you're after is the fact that you'll never encounter cli.System.Object or cli.System.Exception as field types or in method signatures. The only way to get the class object for these types, is by calling getClass() on an instance or by calling Class.getSuperclass() on a subclass of them.

Note that the above behaviour isn't actually implemented yet. At the moment cli.System.Object and cli.System.Exception are simply helper classes with static methods and you'll never encounter them as field types, in method signatures or as instance types.

As to which one to use from Java, you're right when you say java.lang.* is the answer in most cases. The only time you want to use cli.System.Object or cli.System.Exception is when you subclass them from Java to make your Java class look more .NET-like to .NET consumers.

What would be really nice would be if there were something in the IK.VM.NET.dll that would let me answer this question authoritatively with a simple call...

There is the NativeCode.java.lang.VMClass.getClassFromType() method that is used by the runtime internally (it's public, so you could call it), but I can't really guarantee that it'll stay around or behave consistently over time. At some point in the future there'll probably be a utility class in the ikvm.lang package that will contain conversion methods to go from System.Type to java.lang.Class and vice versa (but it probably will require you to specify the context for the mapping).

Oh, one other thing that occurred to me would be a nice feature: if a class implemented in Java could put itself into the cli.* package and thereby make itself look to other Java code as if it were a .NET type, without actually putting it in a cli.* namespace in .NET. In other words, a Java class cli.Foo.Bar would be compiled as namespace Foo.Bar *without* the attribute that preserves its name in Java, so that Foo.Bar then gets translated back to cli.Foo.Bar when Java code sees it.

Sort of the inverse of the attribute for turning name mangling off on the .NET side.

Can you explain why and when you'd want to use this?

Mike asks:

Are there any tasks for a CS scrub like myself to work on with IKVM?

Here is a partial list of things that need to be done:

Implement the AWT peers

Implement the missing JNI methods

Build a framework to test the verifier / bytecode compiler

Write a Java implementation of String.valueOf(float) and String.valueOf(double)

Search the source for // TODO for things that seem doable (or write test cases that show the current code is broken)

Write documentation

Design a logo

Design a website

If you (or anyone) decide to work on something, please send e-mail to the ikvm-developers list, so we can coordinate.

Miguel posted a nice example of how to use Gtk# from Java using IKVM/Mono on his blog. In response Pablo posted a question to the Mono list and Jonathan Pryor replied with a nice explanation of how delegates are handled to the IKVM and Mono lists (quoted with permission, slightly edited):

From: Jonathan Pryor
Sent: Friday, March 19, 2004 02:35
To: Pablo Baena
Cc: Miguel de Icaza; mono-list@lists.ximian.com; ikvm-developers@lists.sourceforge.net
Subject: Re: [Mono-list] Java and C#
Below...
On Thu, 2004-03-18 at 16:07, Pablo Baena wrote:
> Miguel: I saw your blog about IKVM. One thing I haven't been able to
> investigate is, how useful can be Gtk# with Java. Because, for example, I
> couldn't find a clue on how to attach a Java 'listener' to a C# event, or any
> way to use attributes in Java.
They really need to document this better...
However, grepping through the ikvm.zip file (from their website), we
see:
// file: classpath/java/lang/VMRuntime.java
cli.System.AppDomain.get_CurrentDomain().add_ProcessExit (
    new cli.System.EventHandler (
        new cli.System.EventHandler.Method () {
            public void Invoke (Object sender, cli.System.EventArgs e) {
                Runtime.getRuntime().runShutdownHooks();
            }
        }
    )
);
From this (and prior knowledge), we can draw the following statements:
1. Properties are actually functions with `get_' and `set_' prefixed to
them. Thus C# property System.AppDomain.CurrentDomain is the static
Java function cli.System.AppDomain.get_CurrentDomain().
2. Events are actually functions with `add_' and `remove_' prefixed to
their name. Thus C# event System.AppDomain.ProcessExit is the static
Java function cli.System.AppDomain.add_ProcessExit().
3. There is no equivalent to C# delegates in Java, so these are
translated into a class + interface pair. The EventHandler class is the
standard C# type name (cli.System.EventHandler), which takes as an
argument an interface to invoke, named "cli." + C# delegate type name +
".Method", hence cli.System.EventHandler.Method. The EventHandler.Method
interface has a function Invoke() which must be implemented, and this
method will be invoked when the event is signaled.
I suspect that there is no way to add attributes in Java. Microsoft's
Visual J# permits the use of Attributes (IIRC), but it's through their
Visual J++ syntax -- through a specially formed JavaDoc comment.
Something like (from memory):
/**
* @attribute-name (args...)
*/
public void myMethod () {/* ... */}
Of course, that's compiler specific, and no standard Java compiler will
support that. So when it comes to attributes, you're probably up the
creek.
- Jon
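To make the class + interface pairing from Jon's third point concrete without needing IKVM on hand, here is a plain Java sketch of the same shape. DemoEventHandler and DemoDomain are stand-ins invented for this example (they only mimic the naming pattern of cli.System.EventHandler and the add_/remove_ event methods), and the event arguments are simplified to Object.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a delegate type as seen from Java: a class that wraps
// an inner interface named Method with a single Invoke function.
final class DemoEventHandler {
    interface Method {
        void Invoke(Object sender, Object e);
    }

    private final Method method;

    DemoEventHandler(Method method) {
        this.method = method;
    }

    void Invoke(Object sender, Object e) {
        method.Invoke(sender, e);
    }
}

// Stand-in for a type that exposes an event as an add_/remove_ pair.
final class DemoDomain {
    private final List<DemoEventHandler> handlers = new ArrayList<DemoEventHandler>();

    void add_ProcessExit(DemoEventHandler handler) {
        handlers.add(handler);
    }

    void remove_ProcessExit(DemoEventHandler handler) {
        handlers.remove(handler);
    }

    // Signals the event by invoking every registered handler.
    void raiseProcessExit() {
        for (DemoEventHandler handler : new ArrayList<DemoEventHandler>(handlers)) {
            handler.Invoke(this, null);
        }
    }
}

final class DelegateDemo {
    // Registers a listener, raises the event, unregisters, raises again,
    // and returns how many times the listener ran.
    static int run() {
        final int[] calls = {0};
        DemoDomain domain = new DemoDomain();
        DemoEventHandler handler = new DemoEventHandler(new DemoEventHandler.Method() {
            public void Invoke(Object sender, Object e) {
                calls[0]++;
            }
        });
        domain.add_ProcessExit(handler);
        domain.raiseProcessExit();
        domain.remove_ProcessExit(handler);
        domain.raiseProcessExit();
        return calls[0];
    }
}
```

The anonymous inner class plays the role of the Java 'listener', and the wrapper object is what gets handed to the add_ method.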

I replied saying that I believe that the attribute construct in JDK 1.5 can probably be used to expose .NET attributes to Java (and use them in Java code that is targeted to run on IKVM).
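For readers who haven't looked at the JDK 1.5 construct yet, this is roughly what declaring and reading such metadata looks like on the Java side. The Obsolete annotation below is a hypothetical stand-in that I made up for the example; nothing here shows how (or whether) IKVM would map it to a real .NET attribute.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical annotation, loosely modelled on the idea of a .NET attribute.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Obsolete {
    String message() default "";
}

final class AnnotationDemo {
    @Obsolete(message = "use newMethod instead")
    static void oldMethod() {
    }

    // Returns the message of the Obsolete annotation on the named
    // no-argument method of this class, or null if it has none.
    static String obsoleteMessage(String methodName) {
        try {
            Method method = AnnotationDemo.class.getDeclaredMethod(methodName);
            Obsolete annotation = method.getAnnotation(Obsolete.class);
            return annotation == null ? null : annotation.message();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }
}
```

Unlike the JavaDoc comment trick, this metadata ends up in the class file in a standard format, which is what would make it visible to a bytecode compiler.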

In Tuesday's snapshot, ikvmc was completely broken. Sorry about that. The CoreClasses cache introduced an incorrect dependency between Object, Throwable and String. This caused Throwable or String to be loaded while it was being loaded and that resulted in an exception: System.ArgumentException: Item has already been added. Key in dictionary: "java.lang.Throwable" Key being added: "java.lang.Throwable"

Hopefully this snapshot will be a little better quality, but don't hold your breath, because the main change in this version is the addition of local variable liveness analysis to the verifier. This required some tricky code and made it clear to me (again) that the verifier desperately needs to be rewritten.

The trigger for the local variable liveness analysis was to be able to emit debugging information for local variables, but it also has the nice side effect of allowing a little better code generation. Previously, if a local variable slot was shared between two different reference types, the .NET local would have the type of the common base type, even if the uses were in fact totally distinct. The compiler had to emit downcasts whenever it emitted a load from one of those locals. In classpath.dll there were 1288 such downcasts. With the new liveness information, it is now possible to split those Java locals in multiple .NET locals, so these downcasts are now gone. Another optimization, which doesn't seem all that exciting, is the elimination of dead stores to local variables. In itself this is a fairly pointless optimization, because the CLR/Mono JIT will probably do it anyway. However, there is one very important optimization that can be done because of dead store elimination, in exception handling. Whenever an exception handler discards the exception object and the IKVM bytecode compiler can detect this, it can skip the (expensive) stack trace capturing that is normally required. I had already hacked some support to recognize these exception handlers (in classpath.dll there were 313 optimized exception handlers), but now it works much better (there are now 444 optimized exception handlers).
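Here is a small Java sketch of the two handler shapes involved. The class and method names are invented for the example; the first handler never reads the exception object, so the store into its local is dead, while the second one keeps it live. I'd expect only the first shape to be eligible for skipping the stack trace capture.

```java
final class DiscardedExceptionDemo {
    // The caught exception is never used: the handler only needs to know
    // that parsing failed, so the exception object can be discarded.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Here the exception object is read inside the handler, so the store
    // to the local variable holding it is not dead.
    static String parseOrExplain(String text) {
        try {
            return String.valueOf(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return "bad input: " + e.getMessage();
        }
    }
}
```

Code like parseOrDefault is very common in class libraries, which is why the number of optimized handlers in classpath.dll is worth tracking.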

What's new?

Decorated the various ByteCodeHelper methods with the [DebuggerStepThroughAttribute] attribute to make stepping through the source code in the debugger less disruptive.

Restored the signature decoding methods in ClassFile.cs that I removed in the previous version. I had failed to realise that they're different from the ones in ClassLoaderWrapper, because they deal with unloadable classes.

Fixed CoreClasses to decouple the different classes (accessing one no longer triggers loading the others).

Changed handling of package accessible final fields (they're no longer turned into a property).

Fixed System.setOut (copy & paste mistake, it tried to set "in").

The debugging information is now classified as Java/Text. Not sure if this affects anything, but it seemed like the right thing to do.

Fixed debugging line number information to make sure the first CIL instruction has a corresponding line number. Previously, Visual Studio .NET refused to step into an ikvmc compiled method.

Optimized dead stores to local variables and use new dead store information to optimize exception handling.

Local variables are now properly typed and have their names attached in debugging information (when ikvmc with the -debug option is used). NOTE: if the same variable name is reused in a method, the debugging information for those variables is not yet emitted correctly.

Fixed mapping of System.IntPtr to gnu.classpath.RawData. The mapping is now private to classpath.dll.

Changed JVM.CriticalFailure to always write to stderr on Unix instead of trying to display a message box.

Fixed race condition between returning from Thread.join and the thread being removed from the thread pool / marked as dead.

Fixed Thread.yield to not consume thread interrupted status. (Note that Thread.sleep(0) behaves as Thread.yield() and also does not consume the interrupted status).
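The Thread.yield fix from the list above can be checked with a few lines of Java. This is a sketch with an invented class name; it only exercises yield, because I'm only describing what Thread.sleep(0) does on IKVM here.

```java
final class YieldInterruptDemo {
    // Sets the interrupted status, yields, and reports whether the status
    // was still set afterwards.
    static boolean interruptSurvivesYield() {
        Thread.currentThread().interrupt(); // set the interrupted status
        Thread.yield();                     // must not consume it
        // Thread.interrupted() reads and clears the status, so the calling
        // thread is left in a clean state when this method returns.
        return Thread.interrupted();
    }
}
```

If yield consumed the status, code that polls for interruption after yielding would silently miss the interrupt.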

Last week I said I'd go through a stabilization phase, but I couldn't resist the urge to implement some more stuff and fix various things. So this snapshot is a fairly big change again, but no major architectural overhaul like the previous one.

What's new?

Merged with current Classpath cvs.

Support for JDK 1.5 style class literals (only for class files with version 49 or greater).

Removed signature decoding from ClassFile.cs (I once thought that it should live there, instead of in ClassLoaderWrapper, but that turned out not to be a good idea).

Added CoreClasses.cs to cache a few of the frequently used TypeWrappers (Object, Class, String and Throwable).

Fixed volatile long/double handling to use the (new in .NET 1.1) Thread.VolatileRead/VolatileWrite methods.

Changed type used in ImplementsAttribute to the ghost wrapper for ghosts.

Changed method name mangling for interface implementation stubs (shorter name and now uses a slash to make sure it doesn't clash with any Java method names).

Added support for Finalize/finalize method overriding when mixing Java and non-Java classes in the class hierarchy. I don't like this solution very much. The code is ugly and complicated.

Added special support for finalize method for .NET types that extend Java types.

Fixed handling of synchronized static methods. Previously, the .NET MethodImplOptions.Synchronized flag was simply set, but this was incorrect because that causes the method to synchronize on the .NET Type object, instead of the Java Class object.

Fixed handling of instance calls on value types.

Fixed System.currentTimeMillis implementation to use DateTime.UtcNow instead of Environment.TickCount, to prevent overflow.

Changed System.setErr/setIn/setOut to use TypeWrapper based reflection instead of .NET reflection.

Changed handling of resources to use .NET resources instead of global fields, this allows resources to work in multi-module assemblies.

Changed URL format for assembly embedded resources from opaque to parseable, to facilitate parsing them as a URI.

Added support for passing ghost references to methods in map.xml instructions.

Fixed a regression introduced in the previous snapshot, that caused exception mapping not to be invoked for catch(Throwable).

Limited fixes to get AWT working again (after Classpath AWT changes).

Declared String.equals and String.compareTo(Object) in map.xml to make reflection appearance identical to JDK.

Implemented JDK 1.4 String methods that rely on regular expressions (Classpath now has java.util.regex.* support, although not 100% compatible with the JDK).

Minor performance improvement in String.hashCode implementation. Oddly enough, by doing the length check in the for condition, instead of manually hoisting it out of the loop. Apparently the CLR JIT recognizes this pattern and optimizes it better.
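To show what the two loop shapes from the String.hashCode item look like, here they are in Java with invented method names. Both compute the standard Java string hash; the sketch says nothing about speed by itself, since the difference I measured comes from how the CLR JIT treats the first shape.

```java
final class HashLoopDemo {
    // Length check done in the for condition on every iteration.
    static int hashWithCheckInCondition(char[] chars) {
        int h = 0;
        for (int i = 0; i < chars.length; i++) {
            h = 31 * h + chars[i];
        }
        return h;
    }

    // Length manually hoisted out of the loop into a local.
    static int hashWithHoistedLength(char[] chars) {
        int h = 0;
        int length = chars.length;
        for (int i = 0; i < length; i++) {
            h = 31 * h + chars[i];
        }
        return h;
    }
}
```

My guess is that the first form lets the JIT prove that the index is always within bounds, but I haven't verified that.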

Yesterday I looked at the JDK 1.5 beta that Sun released recently. There appears not to be a complete list of changes to the VM yet and the only things I found were a few new modifier bits (that haven't yet stabilized) and the fact that class literals are finally supported in the VM. This is important for IKVM.NET, because it makes class literals in statically compiled code work better and more efficient.

For a quick refresher of how class literals are currently compiled, let's look at how the following class is compiled:

The amount of code generated is pretty bizarre. Note that this isn't Jikes' fault, there just isn't a way to do it better. Now, here is what it looks like compiled with javac from the 1.5 beta (specifying the -target 1.5 option):

This looks a lot better! No new bytecode instruction was added, instead the ldc instruction was modified to allow referencing a CONSTANT_Class_info. When the VM encounters this it loads the class and pushes the class object on the stack. I added support for this to IKVM.NET (not in cvs yet) in about 15 minutes. When JDK 1.1 was released (the first version to support class literals in the source), I wondered why they didn't add VM support at the same time, but fortunately they finally got around to it.

Trivia

If you looked closely at the Jikes generated code, you may have noticed that Jikes actually loads the string array class ("[Ljava.lang.String;") instead of java.lang.String. Why does it do this? It does this, because it correctly implements the JLS. The JLS says that class literals should not cause a class to be initialized. Doing a Class.forName() initializes the class, but when you initialize an array class you don't initialize the component type class. So this is a clever trick. Javac doesn't do this, so it (incorrectly) causes the class to be initialized.
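Written out as Java source, the two lookup strategies look roughly like this. It is a hand-written approximation with invented names, not the exact code that either compiler generates: the first method mirrors the cached Class.forName() pattern, the second the array class trick.

```java
final class ClassLiteralDemo {
    // Cache field, comparable to the synthetic static field that
    // pre-1.5 compilers add for each class literal.
    private static Class<?> cachedStringClass;

    // Look the class up by name at run time and cache the result.
    static Class<?> viaForName() {
        if (cachedStringClass == null) {
            try {
                cachedStringClass = Class.forName("java.lang.String");
            } catch (ClassNotFoundException e) {
                throw new NoClassDefFoundError(e.getMessage());
            }
        }
        return cachedStringClass;
    }

    // Load the array class instead and take its component type, so the
    // element class itself is not initialized by the lookup.
    static Class<?> viaArrayClass() {
        try {
            return Class.forName("[Ljava.lang.String;").getComponentType();
        } catch (ClassNotFoundException e) {
            throw new NoClassDefFoundError(e.getMessage());
        }
    }
}
```

Both methods return the same Class object as the String.class literal; with the JDK 1.5 ldc support all of this collapses into a single instruction.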

IKVM.NET

Why does this change help statically compiled code in IKVM.NET? Performance is a bit better, but that's not the most important difference. The real benefit shows up when you statically compile code into multiple assemblies. If one assembly references a class in another assembly via a class literal, you'd better be sure that the referenced assembly is already loaded in the AppDomain, otherwise the IKVM.NET runtime is unable to find the class. With the new (JDK 1.5) way of referencing class literals, the reference is no longer opaque to ikvmc, so it can now compile the construct in such a way that the class literal causes the appropriate assembly to be loaded by the .NET runtime when it is executed.

StringBuilder

Something that struck me as funny is the new StringBuilder class that JDK 1.5 includes. It's almost identical to StringBuffer, except that it is not thread safe. If you look at the Rotor source code, you can see that the .NET StringBuilder also started life as StringBuffer. Now if the next version of .NET includes a thread safe version of StringBuilder and names it StringBuffer, we've come full circle.

Stuart commented: I'm not convinced that cli.System.Object should be visible to Java at all. AIUI, Java code will never see instances of cli.System.Object, because all such objects appear to inherit from java.lang.Object instead.

If cli.System.Object *is* visible to Java code, it introduces a paradox: java.lang.Object inherits from cli.System.Object (per the way it's actually implemented) but cli.System.Object should appear to inherit from java.lang.Object (per Java's rule that *everything* inherits from java.lang.Object). Now, it may be possible to create magic glue code that inverts the apparent inheritance relationship like that, but do you really want to go there? :)

The inversion is exactly what I was thinking about. Stuart's analysis above contains a crucial mistake, java.lang.Object does not inherit from cli.System.Object. However, it is virtually impossible not to get confused about this stuff, so let's try to make the discussion a little easier by defining a naming convention:

java.lang.Object: This is the base class of all Java classes (as seen from the Java side of the world).

[classpath]java.lang.Object: This is an implementation artifact of IKVM, it is a .NET type that is used as the base type for all non-remapped Java classes.

System.Object: This is the base class of all .NET classes (as seen from the .NET side of the world).

cli.System.Object: This is the IKVM manifestation of the System.Object type on the Java side of the world.

The paradox is that [classpath]java.lang.Object inherits from System.Object and cli.System.Object inherits from java.lang.Object, but hopefully it is now clear that this isn't a problem. (BTW, one of the definitions of a paradox is "A seemingly contradictory statement that may nonetheless be true").

There are actually two reasons why I would want to do this:

If a Java class extends a .NET type (that was exported using netexp) you see both the virtual methods in java.lang.Object as well as the ones in System.Object that the class in question happens to override (a fairly arbitrary set). By introducing cli.System.Object as the penultimate base class for all .NET types, this can be made much more consistent. cli.System.Object would have final implementations for all the virtual methods in java.lang.Object (to make sure that the essentially non-existing methods don't get overridden) and it would introduce the real virtual methods of System.Object.

If you want to define your own "first class" .NET exception class in Java, you need to extend cli.System.Exception. In other words, it makes for a more powerful programming model to expose the remapped types in this way.

Yesterday I checked in a major change set that implements the new object model mapping infrastructure. Today I put the new snapshots online as well. The new implementation is about a thousand lines less code than the previous.

What's new

Many code changes to implement the new model.

When compiling classpath.dll, ikvmc now requires the -remap:map.xml option. This is the only time the mapping information is read from the XML. When code actually runs, or when other classes are compiled, the remapping information is read from custom attributes in classpath.dll.

Tracing infrastructure. Interesting points in the runtime now contain trace calls that can be enabled with a command line switch (or app.config setting). In addition, when Java code is compiled it can optionally be instrumented so that each method called writes its name and signature to the tracer. This has a big performance impact (it will be optimized a little bit in the future, but don't expect too much), so it is not enabled for classpath.dll, by default.

classpath.dll now contains the remapped types (java.lang.Object, java.lang.Throwable, java.lang.String and java.lang.Comparable). This means that if you want to create a Java-like class in C# you can now extend java.lang.Object. Note however that you should never define your references as java.lang.Object, use System.Object instead. If you want to call a java.lang.Object instance method on a System.Object reference, use the corresponding static instancehelper method on java.lang.Object.

Finalization

From the Java side of the fence, finalization continues to work as it always has, but when C# code is subclassing Java code, you should use the C# destructor if you need finalization. If you override the finalize method, you run the risk that it isn't called (it only gets called if one of your Java base classes actually overrides it). The C# destructor does the right thing. If you use another .NET language, you have to override Finalize and make sure that you call the base class Finalize. More complicated mixed scenarios (e.g. Java code subclassing C# code that subclasses Java code) are not supported at the moment (wrt finalization, other aspects should work fine).

What's next?

It's not quite done yet, but I'll be going through a stabilization phase before making any more changes. I have some ideas for changes to the way the remapped .NET types appear on the Java side (e.g. should it be possible to extend cli.System.Object in Java?). There are also some optimizations that can be done and there still remains some restructuring to be done.

Snapshots

I've tested this snapshot pretty well, but considering the scale of the changes, I expect some regressions. Bug reports are appreciated (as always).

Next month I'm speaking again at the rOOts conference in Bergen, Norway, where I had a very good time last year. Come and say hi if you're there. Also, I'm happy to be speaking again at the excellent (and fun) Colorado Software Summit in Keystone, Colorado in October.