I've modified ikvmc to use IKVM.Reflection and largely rewritten ikvmstub to work directly with the ikvm internals instead of using the Java reflection API. Both ikvmc and ikvmstub can now process assemblies independently of the .NET runtime they run on. This makes it possible to start investigating Silverlight support.

Changes:

Drag-n-drop fix by Nat.

Fixed regression introduced in previous development snapshot, related to field accessors.

In November 2008 I introduced IKVM.Reflection.Emit; today I'm introducing IKVM.Reflection. It supersedes IKVM.Reflection.Emit and also includes the ability to read managed assemblies. In addition, I've added many other features that aren't directly needed for ikvmc, but are useful for other applications. Almost the complete reflection API has now been implemented and there are several API extensions to support managed PE features that reflection doesn't support (well).

Why?

When I started on IKVM.Reflection.Emit, it wasn't at all clear to me that it would be possible to re-implement the System.Reflection.Emit namespace without also re-implementing the System.Reflection namespace, but it turned out it was. I did run into a few snags, such as the inability to subclass Module and Assembly (this was fixed in .NET 4.0) and a couple of Mono bugs, but on the whole IKVM.Reflection.Emit was very successful. So why then re-implement the System.Reflection namespace as well? The main reasons are Silverlight and .NET 4.0. For ikvmc to be able to target versions of the runtime different from the one it is currently running on, it is necessary to avoid using System.Reflection, because System.Reflection can only ever work with the mscorlib version of the current runtime.

ikvmc & ikvmstub

In the coming time, I plan on integrating IKVM.Reflection into ikvmc (most of the work for this has already been done; if you've been following the ikvm-commit list, you may have seen some changes go in and wondered why they were necessary) and ikvmstub (I haven't started on this yet). Currently, ikvmstub uses Java reflection to expose members of the .NET types in an assembly. I chose this because it was the easiest way to make sure that what ikvmstub generated matched the ikvm runtime behavior (because it simply used the ikvm runtime to do the mapping). There are two downsides to this approach. The first is the same as mentioned above for ikvmc: you can only generate mscorlib stubs for the runtime you're currently running on. The second is more philosophical: it introduces a cycle in the build process. To build the IKVM.OpenJDK.*.dll assemblies, you need mscorlib.jar and System.jar, but to generate these stubs you need a compiled class library. To solve both these issues, I plan to rewrite ikvmstub to work directly on the internal ikvm runtime representations (with conditional compilation, like ikvmc does).

Features

This list is not exhaustive, but here are some interesting features of IKVM.Reflection (that are not in System.Reflection):

No thread safety. If you want thread safety, you'll have to lock the universe object during every operation.

You can choose what version of mscorlib to load in the universe (using Universe.LoadMscorlib()) and by implementing a Universe.AssemblyResolve handler you can decide the framework assembly unification policy.

Support for querying and emitting .NET 2.0 style declarative security.

Support for defining unmanaged resources from a byte array (in .NET, AssemblyBuilder.DefineUnmanagedResource(byte[]) is broken, only the overload that accepts a filename works).

Support for reading field RVA data (e.g. for the fields that are used by the C# compiler to initialize arrays).

The ability to enable/disable "exception block assistance", or get "clever" assistance.

Support for querying methodimpl mappings.

Support for reference, pointer and array types with custom modifiers (this CLR feature is used by C++/CLI).

Missing Features

Some things are still missing. The most notable being the Emit differences. The emit code was based on the IKVM.Reflection.Emit code and likewise still lacks some of the querying support (for baked types), although the new code is much better than the code in IKVM.Reflection.Emit.dll in this respect.

Here's a list of methods that can still throw a NotImplementedException:

FieldBuilder.__GetDataFromRVA()

ModuleBuilder.ResolveType()

ModuleBuilder.ResolveMethod()

ModuleBuilder.ResolveField()

ModuleBuilder.ResolveMember()

ModuleBuilder.ResolveString()

ModuleBuilder.__ResolveOptionalParameterTypes()

ModuleBuilder.GetArrayMethod()

GenericTypeParameterBuilder.BaseType

GenericTypeParameterBuilder.__GetDeclaredInterfaces()

GenericTypeParameterBuilder.GetGenericParameterConstraints()

GenericTypeParameterBuilder.GenericParameterAttributes

TypeBuilder.CreateType() (when invoked a second time)

TypeBuilder.__GetDeclaredFields()

TypeBuilder.__GetDeclaredEvents()

TypeBuilder.__GetDeclaredProperties()

ISymbolDocumentWriter.SetCheckSum()

ISymbolDocumentWriter.SetSource()

Most methods in ISymbolWriter

ManifestResourceInfo.ResourceLocation (for resources located in another assembly)

AssemblyBuilder.DefineUnmanagedResource(byte[]) (because it is broken)

MethodBuilder.CreateMethodBody()

Everything that doesn't make sense in a ReflectionOnly context.

Concepts that are not implemented:

Most metadata tokens returned by Emit objects are not properly typed (and can't be used for anything, other than comparing them against other metadata tokens).

When defining debugging symbols, a single method can only point to a single source document.

All type/member lookup operations are case sensitive.

Implementing a custom System.Reflection.Binder is not supported.

Modules with unsorted metadata tables are not supported.

When Type.GetMethods() (or __GetDeclaredMethods) is called on Array types it throws a NotImplementedException, instead of returning the special array accessor methods.

Managed function pointer types are not supported. Like System.Reflection, they are returned as System.IntPtr instead.

Linker Prototype

To see if I had missed any important CLR features, I wrote a prototype assembly linker. It is pretty capable, but should not be mistaken for something that is usable for anything other than exploring. I've used it with C++/CLI (compiled with /clr:pure) to test the more esoteric CLR features. The source for the linker prototype is in the zip linked to below.

The IKVM.Reflection source code is available in cvs. If you just want the binary, the LinkerPrototype.zip contains it.

In Java, static initializers can deadlock; on .NET, some threads can see uninitialized state in cases where deadlock would occur on the JVM.
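To make the "uninitialized state" part concrete, here is a minimal sketch in plain Java. It is not IKVM code and the class names are made up for illustration. Two classes have static initializers that depend on each other. On a single thread the JVM lets the recursive access go through and the inner initializer observes the default value 0. If two threads each started initializing one of the classes at the same time, a JVM would block each thread on the other's initialization (the deadlock case); per the note above, on .NET one of the threads would instead see the not-yet-initialized value.

```java
// Hypothetical classes, only for illustration.
final class InitA {
    // Not a compile-time constant, so reading it triggers class initialization.
    static final int X = InitB.Y + 1;
}

final class InitB {
    static final int Y = InitA.X + 1;
}

final class InitCycleDemo {
    // Touches InitA first, so InitB's initializer runs while InitA is still
    // being initialized by the same thread and reads InitA.X as 0.
    static int[] touchAFirst() {
        return new int[] { InitA.X, InitB.Y };
    }

    public static void main(String[] args) {
        int[] values = touchAFirst();
        System.out.println("InitA.X = " + values[0] + ", InitB.Y = " + values[1]);
        // prints "InitA.X = 2, InitB.Y = 1"
    }
}
```

The single-threaded run only shows what "seeing uninitialized state" looks like; the two-thread variant is deliberately left out because on a JVM it can hang.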

JNI

Only supported in the default AppDomain.

Only the JNICALL calling convention is supported! (On Windows, HotSpot appears to also support the cdecl calling convention).

Cannot call string constructors on already existing string instances.

A few limitations in Invocation API support

The Invocation API is only supported when running on .NET.

JNI_CreateJavaVM: init options "-verbose[:class|:gc|:jni]", "vfprintf", "exit" and "abort" are not implemented. The JDK 1.1 version of JavaVMInitArgs isn't supported.

JNI_GetDefaultJavaVMInitArgs not implemented

JNI_GetCreatedJavaVMs only returns the JavaVM if the VM was started through JNI or a JNI call that retrieves the JavaVM has already occurred.

DestroyJavaVM is only partially implemented (it waits until there are no more non-daemon Java threads and then returns JNI_ERR).

DetachCurrentThread doesn't release monitors held by the thread.

Native libraries are never unloaded (because code unloading is not supported).

The JVM allows any reference type to be passed where an interface reference is expected (and to store any reference type in an interface reference type field), on IKVM this results in an IncompatibleClassChangeError.

monitorenter / monitorexit cannot be used on an uninitialized this reference.

Floating point is not fully spec compliant.

A method returning a boolean that returns an integer other than 0 or 1 behaves differently (this also applies to byte/char/short and for method parameters).

Synchronized blocks are not async exception safe.

Ghost arrays don't throw ArrayStoreException when you store an object that doesn't implement the ghost interface.

Class loading is more eager than on the reference VM.

Interface implementation methods are never really final (interface can be reimplemented by .NET subclasses).

JSR-133 finalization spec change is not fully implemented. The JSR-133 changes dictate that an object should not be finalized unless the Object constructor has run successfully, but this isn't implemented.
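The ghost array item in the list above can be illustrated with a small Java sketch. It assumes that java.lang.CharSequence is one of the interfaces IKVM treats as a ghost interface; the class name is made up. On a reference JVM the store below fails the runtime array store check; according to the note above, on IKVM the same store into a ghost array would go through without an ArrayStoreException.

```java
// Hypothetical demo class, only for illustration.
final class GhostArrayDemo {
    // Returns true when the JVM rejects the store with ArrayStoreException.
    static boolean storeThrows() {
        // Assumption: CharSequence is a ghost interface on IKVM.
        Object[] array = new CharSequence[1];
        try {
            array[0] = new Object(); // java.lang.Object does not implement CharSequence
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("store rejected: " + storeThrows());
    }
}
```

Code that relies on the exception (rather than on a later ClassCastException when the element is read back) is the kind of code that would notice this difference.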

Static Compiler (ikvmc)

Some subtle differences with ikvmc compiled code for public members inherited from non-public base classes (so called "access stubs"). Because the access stub lives in a derived class, when accessing a member in a base class, the derived cctor will be run whereas java (and ikvm) only runs the base cctor.

Try blocks around base class ctor invocation result in unverifiable code (no known compilers produce this type of code).

Try/catch blocks before base class ctor invocation result in unverifiable code (this actually happens with the Eclipse compiler when you pass a class literal to the base class ctor and compile with -target 1.4).

Only code compiled in a single assembly fully obeys the JLS binary compatibility rules.

An assembly can only contain one resource with a particular name.

Passing incorrect command line options to ikvmc may result in an exception rather than a proper error message.
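The access stub difference from the first item in this list is easiest to see with a static member. The following is a minimal Java sketch with made-up class names (in real code the derived class would be public and the base class package-private). In Java, reading a static field through a derived class only initializes the class that declares the field, so only the base class initializer runs. With an ikvmc access stub living in the derived class, the derived class initializer would run as well.

```java
import java.util.ArrayList;
import java.util.List;

// Records which static initializers have run. All names here are hypothetical.
final class InitLog {
    static final List<String> events = new ArrayList<>();
}

// Stands in for the non-public base class that declares a public member.
class HiddenBase {
    public static int value = 42;
    static { InitLog.events.add("HiddenBase"); }
}

// Stands in for the public class that inherits the member.
class VisibleDerived extends HiddenBase {
    static { InitLog.events.add("VisibleDerived"); }
}

final class AccessStubDemo {
    // Reads the inherited field through the derived class and reports
    // which static initializers ran as a result.
    static List<String> readThroughDerived() {
        int ignored = VisibleDerived.value;
        return InitLog.events;
    }

    public static void main(String[] args) {
        System.out.println(readThroughDerived()); // prints "[HiddenBase]"
    }
}
```

A static initializer with visible side effects in the derived class is therefore the situation where the ikvmc behavior could be observed.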

Class Library

Most class library code is based on OpenJDK 6 build 16. Below is a list of divergences and IKVM specific implementation notes.

com.sun.security.auth.module

Not implemented.

java.applet

GNU Classpath implementation. Not implemented.

java.awt

Partial System.Windows.Forms based back-end. Not supported.

java.io.Console

Not implemented.

java.lang.instrument

Not implemented.

java.lang.management

Not implemented.

java.net

No IPv6 support implemented.

java.net.ProxySelector

Getting the default system proxy for a URL is not implemented.

java.text.Bidi

GNU Classpath implementation. Not supported.

java.util.zip

Partially based on GNU Classpath implementation.

javax.imageio.plugins.jpeg

Partial implementation. JPEGs can be read and written, but there is no metadata support.

javax.management

Not implemented.

javax.print

Not implemented.

javax.script

Not implemented.

javax.smartcardio

Not implemented.

javax.sound

Not implemented.

javax.swing

Not supported.

javax.tools

Not implemented.

org.ietf.jgss

Not implemented.

sun.jdbc.odbc

Implementation based on .NET ODBC managed provider.

sun.net.www.content.audio

Audio content handlers not implemented.

sun.net.www.content.image

Image content handlers not implemented.

The entire public API is available, so "Not implemented." for javax.print, for example, means that the API is there but there is no back-end to provide the actual printing support. "Not supported." means that the code is there and probably works at least somewhat, but that I'm less likely to fix bugs reported in these areas.

In the weeks before PDC I've been working on compiling Eclipse with ikvmc. This work was triggered by Mainsoft's Eyal Alaluf, who asked me to work on this and also provided a desperately needed starting point. I had wanted to do this for ages, but didn't feel like struggling with the Eclipse build system to figure out how to get started.

A couple of the changes in the most recent development snapshot are specifically related to this. In particular the ability for custom assembly class loaders to be called when the module initializer is run. This enables the statically compiled Eclipse OSGi bundles to be lazily activated on first use.

Run ikvmc to compile the Eclipse plugins:

ikvm\bin\ikvmc @response0.txt
ikvm\bin\ikvmc @response1.txt

(Ignore the warnings and note that this takes a while and requires a lot of memory. I haven't tested this on a 32 bit machine, it may well run out of address space there.)

You can now run "eclipse-clr.exe" to start Eclipse. Note that if you compare startup times, the first time that Eclipse starts it does some initial configuration, so don't compare the first startup with the subsequent ones.

Optionally you can run ngen-all.bat to compile all assemblies to native code. Make sure that you have the x86 version of ngen.exe in your path. Note that this also takes a while.

Source Code

The sources for eclipse-clr.exe are in this Visual Studio 2008 solution. It's pretty small and most of what it does is configure and hook OSGi to change the bundle loading and initialization. If you want to build eclipse-clr.exe, you first have to run ikvmc on response0.txt, then build eclipse-clr.exe (it depends on the OSGi assembly built with response0.txt) and after that you can run ikvmc on response1.txt (it depends on eclipse-clr.exe, because that contains the custom assembly class loader used for the bundles).

The response0.txt and response1.txt files were generated from the OSGi manifests and if there is interest I can publish the source to that as well, but it is pretty hacky.

Performance

When compiled to native with ngen, Eclipse starts up faster than with JDK 1.6 on my systems. In theory the private working set should also be significantly less, allowing multiple Eclipse instances to use far less memory.

Disclaimer

This is just a technology demonstration, not production code and has not been extensively tested.

Jon Skeet recently blogged about the performance of [ThreadStatic] versus the new .NET 4.0 ThreadLocal<T>. I was surprised to see that ThreadLocal<T> was faster than [ThreadStatic], because ThreadLocal<T> uses [ThreadStatic] as the underlying primitive.

How do you go from a static field to a per instance field? It's simple, once you think of it. You (ab)use generic types. Here's a simplified ThreadLocal<T>:

The real ThreadLocal<T> type in .NET 4.0 beta 2 is much more complex, because it has to deal with recycling the types and protecting against returning a value from a recycled type. It also uses a higher base counting system to number the types, the maximum number of types generated (per T) is 4096 in beta 2. After you allocate more than that, it falls back to using a holder type that uses Thread.SetData().

I'm not sure what to make of this. It's a clever trick, but I think it ultimately is too clever. I benchmarked a simpler approach using arrays (where each ThreadLocal<T> simply allocated an index in the [ThreadStatic] array) and it was a little bit faster and doesn't suffer from the downsides of creating a gazillion types (which probably take more memory and those types stay around until the AppDomain is destroyed).
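The array based alternative can be sketched as follows. This is not the code I benchmarked; it is a Java illustration in which java.lang.ThreadLocal stands in for a [ThreadStatic] array field, and all class and method names are made up. Each instance claims an index once, and every thread lazily gets its own array that is grown when an index falls outside it.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one per-thread array shared by all instances.
final class IndexedThreadLocal<T> {
    // Next free slot index, shared by all instances (indices are never reused here).
    private static final AtomicInteger NEXT_INDEX = new AtomicInteger();
    // One array per thread; stands in for a [ThreadStatic] array field.
    private static final ThreadLocal<Object[]> SLOTS =
            ThreadLocal.withInitial(() -> new Object[4]);

    private final int index = NEXT_INDEX.getAndIncrement();

    @SuppressWarnings("unchecked")
    T get() {
        Object[] slots = SLOTS.get();
        // A thread that never stored a value sees null.
        return index < slots.length ? (T) slots[index] : null;
    }

    void set(T value) {
        Object[] slots = SLOTS.get();
        if (index >= slots.length) {
            // Grow this thread's array so that the index fits.
            slots = Arrays.copyOf(slots, Math.max(index + 1, slots.length * 2));
            SLOTS.set(slots);
        }
        slots[index] = value;
    }
}

final class IndexedThreadLocalDemo {
    // Reads the value of 'local' as seen by a freshly started thread.
    static Object readOnNewThread(IndexedThreadLocal<?> local) {
        Object[] seen = { "unset" };
        Thread reader = new Thread(() -> seen[0] = local.get());
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

The sketch never recycles indices, so a program that keeps allocating instances keeps growing the per-thread arrays. A real implementation would have to deal with that, but it avoids generating a type per instance.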

Finally, a tip for Microsoft: move the Cn types out of ThreadLocal, because currently they are also generic (due to the fact that C# automatically makes nested types generic based on the outer type's generic type parameters) and that is unnecessarily wasteful.

Optimized field reflection. We now delay creating the dynamic methods to access the field until after the field has been accessed a couple of times. This saves a lot of memory for fields that are only used a few times.
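The idea behind this optimization can be sketched in plain Java. This is not the IKVM implementation: a MethodHandle stands in for the generated dynamic method, the threshold of 3 is an arbitrary value chosen for the example, and all names are hypothetical. The first few reads go through ordinary (slow) reflection, and only a field that keeps being read pays for the faster accessor.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;

// Hypothetical type with a public instance field to read reflectively.
final class SamplePoint {
    public int x = 7;
}

// Sketch only: not thread safe, and it handles public instance fields only.
final class LazyFieldGetter {
    private static final int THRESHOLD = 3; // arbitrary value for the example
    private final Field field;
    private int slowUses;
    private MethodHandle fast; // stands in for the dynamic method

    LazyFieldGetter(Class<?> owner, String name) {
        try {
            this.field = owner.getField(name);
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    Object get(Object target) {
        try {
            if (fast == null && slowUses >= THRESHOLD) {
                // The field is used often enough: create the fast accessor now.
                fast = MethodHandles.lookup().unreflectGetter(field);
            }
            if (fast != null) {
                return fast.invoke(target);
            }
            slowUses++;
            return field.get(target); // cheap to set up, slower per call
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    boolean isOptimized() {
        return fast != null;
    }
}
```

For example, new LazyFieldGetter(SamplePoint.class, "x") serves its first three reads through Field.get and only builds the MethodHandle on the fourth read.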

On "Patch Tuesday" two weeks ago Microsoft released security bulletin MS09-061. This bulletin describes three issues, one of which I reported to Microsoft on September 12, 2008. I will describe the details of what is now known as CVE-2009-0091. I have no inside knowledge of the other two vulnerabilities.

As mentioned in the original blog entry, I found the bug while browsing the Rotor sources. Here's the fragment that caught my eye:

// This method will combine this delegate with the passed delegate
// to form a new delegate.
protected override sealed Delegate CombineImpl(Delegate follow)
{
    // Verify that the types are the same...
    // Actually, we don't need to do this, because Delegate.Combine already checks this.
    // if (!InternalEqualTypes(this, follow)
    //     throw new ArgumentException(...)

This is from multicastdelegate.cs (Warning: this link leads to Microsoft Shared Source licensed code).

The code that is commented out is a security check. After seeing this I immediately confirmed (using ildasm) that the, at that time current, production version of mscorlib also didn't include the check. I also checked .NET 1.1 and in that version the check is present. I also checked a pre-release version of Silverlight 2.0 and it also didn't include the check. The subsequent Silverlight 2.0 release on October 14, 2008 included the fix. Microsoft did not find it necessary to credit me with the fix (not even privately).

Why Is This a Security Vulnerability?

In my example type safety exploit, I used a union to bypass type safety. That wasn't an actual security vulnerability because such a union requires full trust. However, if we can combine two different delegate types we can do the same and because of the missing type check this was possible.

If you take TypeSafetyExploitPoC.cs and replace the TypeSystemHole method with the following and add a reference to an assembly containing CombinePoCHelper.il (written in MSIL because that is the easiest way to write your own MulticastDelegate subclass that can call the protected CombineImpl method).