Sunday, December 5, 2010

I recently realized that it's been over a year since I last put out a stable Sasa release. Sasa is in production use in a number of applications, but the stable releases on SourceForge have lagged somewhat, and a number of fixes and enhancements have been added since v0.9.2.

So I decided to simply exclude the experimental and broken abstractions and push out a new release so others could benefit from everything Sasa v0.9.3 has to offer.

The changelog contains the full list of changes, which are too numerous to cover here, so I'll list only a few of the highlights.

IL Rewriter

C# unfortunately forbids certain types from being used as generic type constraints, even though these constraints are available to CIL. For instance, the following is legal CIL but illegal in C#:

public void Foo<T>(T value) where T : Delegate { ... }

Sasa now provides a solution for this in its ilrewrite tool. Instead of naming the forbidden constraint directly, you wrap it in Sasa.TypeConstraint<T>, and the rewriter erases all references to TypeConstraint, leaving the desired constraint in place:

public void Foo<T>(T value) where T : TypeConstraint<Delegate> { ... }

The /verify option runs peverify to ensure the rewrite produced verifiable IL. Pass /Debug if you're rewriting a debug build, and /Release if you're rewriting a release build. I have it set up as a Visual Studio post-build event, so I call it with /$(ConfigurationName) for this parameter.

Thread-safe and Null-Safe Event Handling

The CLR provides first-class functions in the form of delegates, but there are various problems that commonly creep up which Sasa.Events is designed to fix:

Invoking delegates is not null-safe

Before invoking a delegate, you must explicitly check whether the delegate is null. If the delegate is in a field instead of a local, you must first copy the field value to a local, or you leave yourself open to a concurrency bug in which another thread sets the field to null between the time you check it and the time you call it (this is commonly known as a TOCTTOU bug, i.e. Time-Of-Check-To-Time-Of-Use).

This involves laborious and tedious code duplication that the C# compiler could have easily generated for us. The Sasa.Events.Raise overloads solve both of the above problems. Instead of:

var dlg = someDelegate;
if (dlg != null) dlg(x, y, z);

you can simply call:

someDelegate.Raise(x, y, z);

This is null-safe, and thread-safe.
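A minimal sketch of how such an overload can work, assuming nothing about Sasa's actual implementation: the class name EventSketch and this three-argument Raise are hypothetical stand-ins. Passing the delegate as a method argument copies the reference, so the null check and the call both see the same value even if another thread clears the field in between.

```csharp
using System;

public static class EventSketch
{
    // Hypothetical stand-in for a Raise overload. 'handler' is a private copy
    // of the caller's field, so no other thread can null it out between the
    // check and the invocation.
    public static void Raise<T0, T1, T2>(
        this Action<T0, T1, T2> handler, T0 arg0, T1 arg1, T2 arg2)
    {
        if (handler != null) handler(arg0, arg1, arg2);
    }
}
```

Extension methods may be invoked on a null reference, which is what lets someDelegate.Raise(x, y, z) succeed silently when nothing is subscribed.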

Event add/remove is not thread-safe

Events are a pretty useful idiom common to .NET programs, but concurrency adds a number of hazards for which the C# designers provided less than satisfactory solutions.

For instance, declaring a publicly accessible event creates a hidden "lock object" that the add/remove handlers lock before modifying the event's backing field. This is not only wasteful in memory, it's also expensive in highly concurrent scenarios. Furthermore, this auto-locking behaviour is completely different for code residing inside the class as compared to code outside the class. Needless to say, these unnecessarily subtle semantics constantly surprise C# developers.

Enter Sasa.Events.Add/Remove. These overloads accept a ref to a delegate field, and perform an atomic compare and exchange on the field directly, eliminating the need for lock objects, and providing more scalable event registration/unregistration. Code that looked like this:

event Action fooEvent;
...
fooEvent += newHandler;

or like this:

event Action fooEvent;
...
lock (this) fooEvent += newHandler;

can now both be replaced by this:

Action fooEvent;
...
Events.Add(ref fooEvent, newHandler);

This code has less overhead than the standard event registration code currently generated by any C# compiler, in both concurrent and non-concurrent settings.
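The compare-and-exchange idea can be sketched as follows. This is an illustration under stated assumptions rather than Sasa's code: LockFreeEvents and its Add/Remove methods are hypothetical names, and the sketch is specialised to Action to keep it short.

```csharp
using System;
using System.Threading;

public static class LockFreeEvents
{
    // Hypothetical analogue of an Add overload taking a ref to a delegate field.
    public static void Add(ref Action field, Action handler)
    {
        Action seen, current = field;
        do
        {
            seen = current;
            var combined = (Action)Delegate.Combine(seen, handler);
            // CompareExchange returns the value that was in the field. If that
            // is not the snapshot we combined with, another thread got there
            // first and we retry against the newer value.
            current = Interlocked.CompareExchange(ref field, combined, seen);
        } while (!object.ReferenceEquals(current, seen));
    }

    // Same retry loop, removing the handler instead of adding it.
    public static void Remove(ref Action field, Action handler)
    {
        Action seen, current = field;
        do
        {
            seen = current;
            var reduced = (Action)Delegate.Remove(seen, handler);
            current = Interlocked.CompareExchange(ref field, reduced, seen);
        } while (!object.ReferenceEquals(current, seen));
    }
}
```

The loop compares with ReferenceEquals rather than ==, because delegate equality compares invocation lists and could report success when the exchange had in fact failed.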

Safe, Statically-Typed and Blazingly-Fast Reflection

Reflection is incredibly useful, and incredibly dangerous. You are forced to work with your objects as untyped data which makes it difficult to write correct programs, and the compiler can't help you.

Most operations using reflection are functions operating over the structure of types. To make reflection safe, we only need a single reflective function that breaks apart an object into a stream of field values. The client then provides a stream processing function (the reflection function) that handles all the type cases that it might encounter.

The client need only provide an implementation of IReflector, which defines a callback-style interface completely describing the CLR's primitive types and providing you with an efficient ref pointer to the field's value for get/set purposes, and a FieldInfo instance providing access to the field's metadata:

The compiler ensures that you handle every case in IReflector. You handle non-primitive objects in IReflector.Object<T>, by recursively calling DynamicType.Reflect(field, this, fieldInfo).

Type<T> and DynamicType use lightweight code generation to implement a super-fast dispatch stub that invokes IReflector on each field of the object. These stubs are cached, so over time the overhead of reflection is near-zero. Compared with typical reflection overheads, this technique is not only safer but significantly faster as well.
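To make the callback style concrete, here is a much-reduced sketch. Everything in it is hypothetical (IFieldVisitor, FieldWalker, Describer and Sample are illustration-only names, not Sasa types), it covers only two primitive cases, and it uses ordinary System.Reflection calls instead of cached code-generated stubs, so it shows the shape of the interface and none of the speed.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical, reduced analogue of a callback-style reflector interface:
// one method per supported field type, plus a catch-all.
public interface IFieldVisitor
{
    void OnInt32(int value, FieldInfo field);
    void OnString(string value, FieldInfo field);
    void OnOther(object value, FieldInfo field);
}

public static class FieldWalker
{
    // Breaks an object into its instance fields and dispatches each one to
    // the visitor method matching the field's declared type.
    public static void Reflect(object target, IFieldVisitor visitor)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (FieldInfo field in target.GetType().GetFields(flags))
        {
            object value = field.GetValue(target);
            if (field.FieldType == typeof(int)) visitor.OnInt32((int)value, field);
            else if (field.FieldType == typeof(string)) visitor.OnString((string)value, field);
            else visitor.OnOther(value, field);
        }
    }
}

// Example visitor: renders each field as "name=value".
public sealed class Describer : IFieldVisitor
{
    public readonly List<string> Lines = new List<string>();
    public void OnInt32(int value, FieldInfo field) { Lines.Add(field.Name + "=" + value); }
    public void OnString(string value, FieldInfo field) { Lines.Add(field.Name + "='" + value + "'"); }
    public void OnOther(object value, FieldInfo field) { Lines.Add(field.Name + "=?"); }
}

// Example target type.
public sealed class Sample
{
    public int Count = 3;
    public string Label = "abc";
}
```

Because the visitor is an interface, the compiler rejects any implementation that forgets a case, which is the safety property described above.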

Extensible, Statically-Typed Turing-Complete Parsing

I covered the implementation of the Pratt parser before, and the interface has changed only a little since then. Pratt parsing is intrinsically Turing complete, so you can parse literally any grammar. The predefined combinators are for context-free grammars, but you can easily inject custom parsing functions.
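For readers unfamiliar with the technique, the core of a Pratt parser fits in a few dozen lines. The sketch below is a standalone illustration and does not use the Sasa parsing API: MiniPratt and its members are hypothetical names, tokens must be separated by spaces, and it evaluates integer arithmetic directly instead of building a syntax tree.

```csharp
using System;

public sealed class MiniPratt
{
    readonly string[] tokens;
    int pos;

    MiniPratt(string[] tokens) { this.tokens = tokens; }

    public static int Evaluate(string source)
    {
        var parts = source.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var parser = new MiniPratt(parts);
        int result = parser.Expression(0);
        if (parser.pos != parts.Length) throw new FormatException("unexpected token");
        return result;
    }

    // Left binding power: higher binds tighter, 0 ends the expression.
    static int Power(string op)
    {
        switch (op)
        {
            case "+": case "-": return 10;
            case "*": case "/": return 20;
            default: return 0;
        }
    }

    string Peek() { return pos < tokens.Length ? tokens[pos] : ""; }
    string Next() { return pos < tokens.Length ? tokens[pos++] : ""; }

    // Core loop: parse a prefix item, then keep absorbing infix operators
    // that bind more tightly than the caller's right binding power.
    int Expression(int rightPower)
    {
        int left = Prefix(Next());
        while (Power(Peek()) > rightPower)
            left = Infix(Next(), left);
        return left;
    }

    // Tokens that begin an expression: literals, unary minus, parentheses.
    int Prefix(string token)
    {
        if (token == "(")
        {
            int inner = Expression(0);
            if (Next() != ")") throw new FormatException("expected )");
            return inner;
        }
        if (token == "-") return -Expression(30);
        return int.Parse(token);
    }

    // Tokens that continue an expression: left-associative binary operators.
    int Infix(string op, int left)
    {
        int right = Expression(Power(op));
        switch (op)
        {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default: return left / right;
        }
    }
}
```

Precedence lives entirely in the binding-power table, so adding an operator means adding a table entry and a case in Infix.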

What's more, each grammar you define is extensible in that you can inherit from and extend it in the way you would any other class. Here is a grammar from the unit tests for a simple calculator:

MIME Parsing

The Sasa.Net.Mail namespace in the Sasa.Net assembly contains functions for parsing instances of System.Net.Mail.MailMessage from strings, including attachments in every encoding I've come across in the past few years. This code has been in production use in an autonomous e-mail processing program which has processed tens of thousands of e-mails over many years, with very few bugs encountered.

It can also format MailMessage instances into string form suitable for transmission over texty Internet protocols.

Deprecated

Unfortunately, the efficient, compact binary serializers from the last release have been deprecated, and the replacements based on Sasa.Dynamics are not yet ready. The ASP.NET page class that is immune to CSRF and clickjacking attacks first released in v0.9 has been removed for now as well, since it depended on the compact binary serializer.

I have plenty of new developments in the pipeline too. 2010 saw many interesting safety enhancements added to Sasa, as outlined above, and 2011 will be an even more exciting year, I assure you!

Edit: the original download on SourceForge was missing the ilrewrite tool. That oversight has now been addressed.

Friday, November 26, 2010

I just committed an implementation of Arrows to my open source Sasa library. It's in its own dll and namespace, Sasa.Arrow, so it doesn't pollute the other production quality code.

The implementation is pretty straightforward, and it also supports C#'s monadic query pattern, also known as LINQ. It basically boils down to implementing combinators on delegates, like Func<T, R>. It's not possible to implement the query pattern as extension methods for Func<T, R> because type inference fails for even the simplest of cases. So instead I wrapped Func<T, R> in a struct Arrow<T, R>, and implemented the query pattern as instance methods instead of extension methods. This removes a number of explicit type parameters that the inference engine struggles with, and type inference now succeeds.
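A rough sketch of the shape this takes, written from the description above and not copied from Sasa.Arrow: the member names Apply, Then and the exact Select/SelectMany signatures are assumptions. The query pattern is implemented as instance methods, so the input type T is already fixed by the receiver and the compiler only has to infer the result types.

```csharp
using System;

// Hypothetical arrow type wrapping Func<T, R>.
public struct Arrow<T, R>
{
    readonly Func<T, R> run;
    public Arrow(Func<T, R> run) { this.run = run; }

    public R Apply(T input) { return run(input); }

    // Sequential composition: feed this arrow's output into 'next'.
    public Arrow<T, R2> Then<R2>(Arrow<R, R2> next)
    {
        var first = run; // lambdas inside a struct cannot capture 'this'
        return new Arrow<T, R2>(x => next.Apply(first(x)));
    }

    // Query pattern: 'select' maps over the arrow's output.
    public Arrow<T, R2> Select<R2>(Func<R, R2> map)
    {
        var first = run;
        return new Arrow<T, R2>(x => map(first(x)));
    }

    // Query pattern: nested 'from' clauses run both arrows on the same input.
    public Arrow<T, R3> SelectMany<R2, R3>(
        Func<R, Arrow<T, R2>> bind, Func<R, R2, R3> project)
    {
        var first = run;
        return new Arrow<T, R3>(x =>
        {
            R a = first(x);
            return project(a, bind(a).Apply(x));
        });
    }
}

public static class Arrow
{
    // Lifts a plain function into an arrow.
    public static Arrow<T, R> Return<T, R>(Func<T, R> f) { return new Arrow<T, R>(f); }
}
```

With these definitions, a query such as "from a in inc from b in dbl select a + b" compiles without any explicit type arguments on the query operators.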

Of course, type inference still fails when calling Arrow.Return() on a static method, but this is a common and annoying failure of C#'s type inference [1].

What is this madness?

Some might question my sanity at this point, since arrows in C# are bound to be rather cumbersome. I have a specific application in mind however, and experimenting with that led naturally to arrows. Basically, while trying to use Rx.NET in a highly configurable and dynamic user interface library, I became dissatisfied with the state management required.

In short, Rx.NET supports first-class signals, which does not play well with garbage collection. They solve this by reifying subscriptions in IDisposable objects that ensure proper cleanup if a signal is no longer required. So every time-varying value in my UI controls now requires me to keep track of two objects, the signal itself, and the IDisposable subscription to prevent it from being garbage collected.

Now consider all the elements of a text box or data grid that may be changing over time, including the text font, the margins, the position, the background, and so on, and you quickly see the state management problem grow.

Arrows can simplify this situation considerably, because instead of programming directly with signals, the user instead programs with signal functions. Since signals are no longer first-class values, there is no garbage collection problem and no need to juggle subscriptions.

There are further advantages in sharing, particularly for this UI library, so there's a great deal of incentive to use arrows. I'm hoping I can hide the use of arrows behind the user interface abstractions so the user has minimal interaction with it.

Saturday, November 6, 2010

A well designed core library is essential for building concise, maintainable programs in any programming language. There are many common, recurrent patterns when writing code, and ideally, these recurring uses should be factored into their own abstractions that are distributed as widely as possible throughout the core library.

However, the common pattern of a value encapsulated in an object has not been factored out into a shared interface in the .NET Base Class Library (BCL). This means one cannot write programs that are agnostic over the type of a value's container, resulting in unnecessary code duplication.

A legitimate argument against this approach is that the containers each have different semantics. For instance, accessing Lazy.Value will block until the value becomes available, but Nullable.Value always returns immediately.

Fortunately, this is not an argument against factoring out the "encapsulated value" pattern, but an argument for another interface that exposes these semantics. In this case, the new pattern is an "optional value":

Lazy, Nullable and Task all exhibit these exact semantics. Programs can then be written that are agnostic over how optional values are encapsulated and processed, and the common interfaces ensure the different behaviours are overloaded in a consistent, familiar way.

We can extend this even further to "mutable encapsulated values, aka, references":

This pattern is less common, but still quite prevalent. For instance, see ThreadLocal<T> (which could also implement IOptional and IVolatile incidentally).
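As an illustration of the three patterns just described, the following sketch declares one interface per pattern and two containers that implement them. The interface names follow the ones mentioned in this post where available (IOptional), but the exact members shown here, and the Option, Ref and Containers types, are assumptions made for the example and should not be read as Sasa's definitions.

```csharp
using System;

// An encapsulated value.
public interface IValue<T> { T Value { get; } }

// An encapsulated value that may be absent.
public interface IOptional<T> : IValue<T> { bool HasValue { get; } }

// A mutable encapsulated value, i.e. a reference.
public interface IRef<T> : IValue<T> { new T Value { get; set; } }

public struct Option<T> : IOptional<T>
{
    readonly T value;
    readonly bool hasValue;
    public Option(T value) { this.value = value; this.hasValue = true; }
    public bool HasValue { get { return hasValue; } }
    public T Value
    {
        get
        {
            if (!hasValue) throw new InvalidOperationException("no value");
            return value;
        }
    }
}

public sealed class Ref<T> : IRef<T>
{
    public T Value { get; set; }
}

public static class Containers
{
    // Works for any optional container, however it stores its value.
    public static T OrDefault<T>(IOptional<T> source, T fallback)
    {
        return source.HasValue ? source.Value : fallback;
    }

    // Works for any container at all.
    public static string Show<T>(IValue<T> source)
    {
        return "<" + source.Value + ">";
    }
}
```

OrDefault and Show are written once against the interfaces and apply to every container that implements them, which is the duplication the BCL types currently force on callers.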

These interfaces have been in the Sasa library for quite some time, and are used consistently throughout the entire library. The consistency has helped considerably in guiding the design of new abstractions, and clarifying their use, since developers can simply understand any new abstraction in terms of the familiar interfaces it implements.

I suppose the lesson to take from all this is to hunt down common patterns, and aggressively factor them out into reusable abstractions. This helps the library's consistency, thus helping clients learn your API by reducing the number of unnecessary new properties and methods.

Thursday, September 23, 2010

There are numerous high-level abstractions available in other languages that simply make programming easier and less error prone. For instance, automatic memory management, pattern matching, exceptions, higher order functions, and so on. Each of these features enables the developer to reason about program behaviour at a higher level, and factor out common behaviour into separate but composable units.

For fun, I've created a few small macro headers that enable some of these patterns in pure C. If anyone sees any portability issues, please let me know!

libex: Exception Handling and RAII

RAII in C is definitely possible via a well-known pattern used everywhere in the Linux kernel. It's a great way to organize code, but the program logic and finalization and error logic are not syntactically apparent. You have to interpret the surrounding context to identify the error conditions, and when and how finalization is triggered.

To address this, I encapsulated this RAII pattern in a macro library called libex, with extensions to support arbitrary local exceptions, and a small set of pre-defined exception types. Currently, this just consists of more readable versions of the error codes in errno.h.

No setjmp/longjmp is used, and libex provides little beyond case checking and finalization, because I wanted to provide zero-overhead exception handling and RAII that can supplant all uses of the undecorated pattern. Replacing all instances of the RAII pattern in Linux with these macro calls would incur little to no additional overhead, as it compiles down to a small number of direct branches.

There are also some convenience macros for performing common checks, like MAYBE which checks for NULL, ERROR which checks for a non-zero value, etc.

Functional languages have long enjoyed the succinct and natural construction and deconstruction of data structures via sum types and pattern matching. Now you can have some of that power via a few simple macros:

There are a few small requirements and caveats, e.g. LET performs dynamic memory allocation. Please see the main libsum page for further details.

License

My default license is LGPL, but since these are macro libraries that's probably not an appropriate choice, given there is no binary that can be replaced at runtime (one of the requirements of the LGPL). I like the freedoms afforded by the LGPL though, so I'm open to alternate suggestions with similar terms. I will also consider the MIT license if there are no viable alternatives.

Monday, May 31, 2010

The CLR's lightweight code generation via DynamicMethod is pretty useful, but it's sometimes difficult to debug the generated code and ensure that it verifies. In order to verify generated code, you must save the dynamic assembly to disk and run the peverify.exe tool on it, but DynamicMethod does not have any means to do so. In order to save the assembly, there's a more laborious process of creating dynamic assemblies, modules and types, and then finally adding a method to said type.

This difficulty in switching between saved codegen and pure runtime codegen led me to add a CodeGen class to Sasa, which can generate code for either case based on a bool parameter. Since no common interface is available for code generation, it also accepts a delegate to which it dispatches for generating the code:

/// <summary>
/// Create a dynamic method.
/// </summary>
/// <typeparam name="T">The type of the dynamic method to create.</typeparam>
/// <param name="type">The type to which this delegate should be a member.</param>
/// <param name="methodName">The name of the delegate's method.</param>
/// <param name="attributes">The method attributes.</param>
/// <param name="saveAssembly">Flag indicating whether the generated code should be saved to a dll.</param>
/// <param name="generate">A callback that performs the code generation.</param>
/// <returns>A dynamically created instance of the given delegate type.</returns>
public static T Function<T>(
    Type type,
    string methodName,
    MethodAttributes attributes,
    bool saveAssembly,
    Action<ILGenerator> generate)
    where T : TypeConstraint<Delegate>;

You can also see Sasa's ilrewrite tool at work here with the T : TypeConstraint<Delegate>. This function will generate either a DynamicMethod or a dynamic assembly and save that assembly to disk, based on the 'saveAssembly' parameter. The assembly name is generated based on the type and methodName parameters.
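A cut-down sketch of the pure runtime half of such a function might look like this. It is not Sasa's CodeGen class: CodeGenSketch is a hypothetical name, the signature is simplified, T is constrained only to class because the rewriter is not assumed, and the save-to-disk branch is left out entirely.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class CodeGenSketch
{
    // Builds a DynamicMethod whose signature matches delegate type T, lets the
    // callback emit its body, and returns the method bound as a T.
    public static T Function<T>(string methodName, Action<ILGenerator> generate)
        where T : class
    {
        MethodInfo invoke = typeof(T).GetMethod("Invoke");
        Type[] parameters = Array.ConvertAll(
            invoke.GetParameters(), p => p.ParameterType);
        var method = new DynamicMethod(methodName, invoke.ReturnType, parameters);
        generate(method.GetILGenerator());
        return (T)(object)method.CreateDelegate(typeof(T));
    }
}
```

A caller supplies the delegate type and the IL, for example an adder built from Ldarg_0, Ldarg_1, Add and Ret, and gets back an ordinary Func<int, int, int>.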

In debugging the Sasa.Dynamics reflection code, I also came across a strange error which was not adequately explained anywhere that I could find. peverify.exe spit out an error to the effect of:

[X.dll : Y/Z][offset 0x0000001D] Unable to resolve token

Where X is the name of the generated dll, Y the namespace path, and Z is the class name. In my case, this occurred when the dynamically generated code was referencing a private class, which should not be possible from a separate dll.

Most framework-style software spends an appreciable amount of time dynamically loading code. Some of this code is executed quite frequently. I've recently been working on a web framework where URLs map to type names and methods, so I've been digging into these sorts of patterns a great deal lately.

The canonical means to map a type name to a System.Type instance is via System.Type.GetType(string). In a framework which performs a significant number of these lookups, it's not clear what sort of performance characteristics one can expect from this static framework function.

Here's the source for a simple test pitting Type.GetType() against a cache backed by a Dictionary<string, Type>. All tests were run on a Core 2 Duo 2.2 GHz, .NET CLR 3.5, and all numbers indicate the elapsed CPU ticks.

          Type.GetType()    Dictionary<string, Type>
          6236070640        51351056
          6236193856        51440360
          6237466224        51463192
          6238210488        51583336
          6240645816        51599480
          6242089400        51687448
          6244450392        51719808
          6245201664        51757472
          6248327048        51793696
          6249253736        51800056
          6250640672        51859704
          6251133912        51885992
          6253544768        51897264
          6254336632        51946408
          6255117872        52046512
          6256060648        52106936
          6256159176        52140984
          6259453568        52391000
Average   6247464250.67     51803928

Each program was run 20 times, and the resulting timing statistics were run through Peirce's Criterion to filter out statistical outliers.

You can plainly see that using a static dictionary cache is over two orders of magnitude faster than going through GetType(). This is a huge savings when the number of lookups being performed is very high.

Edit: Type.GetType is thread-safe, so I updated the test to verify that these performance numbers hold even when locking the dictionary. The dictionary is still two orders of magnitude faster. There would have to be significant lock contention in a concurrent program to justify using Type.GetType instead of a dictionary cache.
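For reference, a locked dictionary cache of the kind measured here can be as small as the following sketch. TypeCache and Resolve are hypothetical names chosen for the example, and this is not the benchmark source.

```csharp
using System;
using System.Collections.Generic;

public static class TypeCache
{
    static readonly Dictionary<string, Type> cache = new Dictionary<string, Type>();

    // Looks the name up in the dictionary first and only falls back to
    // Type.GetType on a miss. The lock keeps the dictionary consistent when
    // called from several threads.
    public static Type Resolve(string typeName)
    {
        lock (cache)
        {
            Type type;
            if (!cache.TryGetValue(typeName, out type))
            {
                type = Type.GetType(typeName);
                if (type != null) cache[typeName] = type; // do not cache misses
            }
            return type;
        }
    }
}
```

Unknown names are deliberately not cached, so a type that becomes loadable later is still found.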

I recently needed a transpose operation that swaps the columns and rows of a nested IEnumerable sequence. It's simple enough to express in LINQ, but after a quick search, all the solutions posted online were rather ugly. Here's a concise and elegant version expressed using LINQ query syntax:

It simply numbers the columns in each row, flattens the sequence of cells, and groups the entries by number. If the table has entries that are missing, this algorithm has the side-effect of compacting all entries so that only the last row or column will be missing the elements. This may or may not be suitable for your application.
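A reconstruction of that approach, written from the description above, so the exact query may differ from the original listing; the class name TransposeSketch is hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TransposeSketch
{
    public static IEnumerable<IEnumerable<T>> Transpose<T>(
        this IEnumerable<IEnumerable<T>> rows)
    {
        return from row in rows
               // number the columns in each row, flattening the cells
               from cell in row.Select((value, column) => new { value, column })
               // then group the cell values by their column number
               group cell.value by cell.column into col
               select col.AsEnumerable();
    }
}
```

For a ragged input such as {1, 2} and {3}, the result is {1, 3} and {2}, which shows the compaction described above.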

Monday, May 17, 2010

Most abstractions of interest have a natural dual, for instance, IEnumerable and IObservable, induction and co-induction, algebra and co-algebra, objects and algebraic data types, message-passing and pattern matching, etc.

Programs are more concise and simpler when using the proper abstraction, be that some abstraction X or its dual. For instance, reactive programs written using the pull-based processing semantics of IEnumerable are far more unwieldy than those using the natural push-based semantics of IObservable. As a further example, symbolic problems are more naturally expressed via pattern matching than via message-passing.

This implies that any system is most useful when we ensure that every abstraction provided is accompanied by its dual. This also applies to virtual machine instruction sets like CIL, as I recently discovered while refining my safe reflection abstraction for the CLR.

A CIL instruction stream embeds some necessary metadata required for the CLR's correct operation. Usually this metadata is type information of some sort, such as the type of a local or the name and/or signature of a target method. For instance, here's a Hello World! example:

Unfortunately, the CIL instruction set suffers from some asymmetries which make some types of programming difficult.

For example, the ldtoken instruction takes an embedded metadata token and pushes the corresponding runtime type handle onto the evaluation stack; this is the instruction executed when using the typeof() operator in C#.

While this operation is useful, we sometimes want its dual, which is to say, we want the metadata used in a subsequent instruction to depend on the object or type handle at the top of the evaluation stack. A related operation is supported on the CLR, namely virtual dispatch, which depends on the concrete type, but dispatch is not general enough to support all of these scenarios because the vtable is immutable.

Consider a scenario where you have an untyped object, like a reference to System.Object "obj", and you want to call into some generic code, like a method Foo<T>(T value), but pass the concrete type of obj for T, instead of System.Object. Currently, you must go through a laborious process where you call GetType() on the object to obtain its type handle, then obtain the method handle via reflection or some clever CIL hackery, then call MethodInfo.MakeGenericMethod in order to instantiate the type argument on Foo<T>, and finally, you must perform a dynamic invocation via reflection or allocate a custom delegate of type Action<T> and perform a virtual call, even though the call is statically known.
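Spelled out with the standard reflection API, the process looks like the sketch below. GenericDispatch and CallFoo are hypothetical names, and Foo<T> here simply reports the T it was instantiated with so the effect is visible.

```csharp
using System;
using System.Reflection;

public static class GenericDispatch
{
    // Stand-in for some fully polymorphic generic code.
    public static string Foo<T>(T value)
    {
        return typeof(T).Name;
    }

    // Calls Foo<T> with T bound to the runtime type of 'obj'.
    public static string CallFoo(object obj)
    {
        Type runtimeType = obj.GetType();                            // 1. get the concrete type
        MethodInfo open = typeof(GenericDispatch).GetMethod("Foo");  // 2. find Foo<T>
        MethodInfo closed = open.MakeGenericMethod(runtimeType);     // 3. instantiate T
        return (string)closed.Invoke(null, new[] { obj });           // 4. dynamic invocation
    }
}
```

Every numbered step allocates or performs a lookup at run time, even though the target method is known when the program is compiled.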

Each step of this process is expensive, and it makes typeful programming on the CLR painful when working on the boundary between typed and untyped code. Many reflection problems, like serialization, become simpler once we're dealing with fully typeful code.

Consider a dual instruction to ldtoken called "bind" which takes a type handle obtained via GetType() and then binds the resulting type handle into the executing instruction stream for the subsequent generic call to Foo<T>. This instruction could be easily and efficiently supported by any VM. Some restrictions are clearly needed for this instruction to remain verifiable, namely that the type constraints required by the target method are a subset of the type constraints of the statically known type, but the verifier already performs this analysis, and all of the information needed is available at the call site.

Fully polymorphic functions like Foo<T> trivially satisfy this condition since they have no constraints whatsoever. Serialize<T>/Deserialize<T> operations are duals, and in fact exhibit exactly the same sort of fully polymorphic type signature as Foo<T>.

There are many more programs that exhibit this structure, but they are unnecessarily difficult to write due to these asymmetries in CIL. This unfortunately requires developers to write a lot of ad-hoc code to get the results they want, and more code results in more bugs, more time, and more headaches.

I've recently completed some Sasa abstractions for safe reflection, and an IL rewriter based on Mono.Cecil which allows C# source code to specify type constraints that are supported by the CLR but unnecessarily restricted in C#. In the process, I came across another unjustified decision regarding verification: the jmp instruction.

The jmp instruction strikes me as potentially incredibly useful for alternative dispatch techniques, and yet I recently discovered that it's classified as unverifiable. This seems very odd, since the instruction is fully statically typed, and I can't think of a way its use could corrupt the VM.

In short, the instruction performs a control transfer to a named method with a signature matching exactly the current method's signature, as long as the evaluation stack is empty and you are not currently in a try-catch block (see section 3.37 of the ECMA specification).

This seems eminently verifiable given a simple control-flow analysis, an analysis which the verifier already performs to verify control-flow safety of some other verifiable instructions. If anyone can shed some light on this I would appreciate it.