Tuesday, February 24, 2009

My .NET Sasa library has fallen by the wayside as I experimented with translating various functional idioms in my FP# library. Reading up on what a few other generic class libraries have been experimenting with, like Mono Rocks, spurred me to putting those experiments to use and updating Sasa. I significantly simplified a lot of the code, documented every class and method, and generalized as much as possible. The license is LGPL v2, and you can download the source via svn:

svn co https://sasa.svn.sourceforge.net/svnroot/sasa/tags/v0.8 sasa

Sasa Core v0.8

A set of useful extensions to core System classes and some useful classes for high assurance development.

Named tuple types: Pair, Triple, Quad.

Either types, representing one of many possible values. There are Either types for 2, 3, and 4 parameters, mimicking the Pair, Triple, and Quad structure. Tuples are "product" types, while Either is a "sum" type, and products and sums are duals. Since products are useful, I figured variously sized sum types might also find some uses. Time will well.

Sasa.Linq

IdentityVisitor which provides default implementations for all NodeTypes and performs no modifications to the expression, just returning it as-is.

ErrorVisitor which which throws NotSupportedException for all NodeTypes.

Sasa.Serialization

A stand-alone assembly with serialization classes.

Provides a compact serializer which requires only ReflectionPermission, and not SecurityPermission like the System classes do; this serializer can therefore be used in medium trust environments. The serializer currently requires a little more discipline from the developer to use correctly, but space savings of 100-200% are typical.

An experimental unsafe, highly compact binary serializer.

Sasa.Net

A library providing missing functionality under System.Net.

A Pop3Client class.

MailMessage parsing.

Sasa.CodeContracts

Microsoft Research is developing a design by contract library which they hope to release with .NET 4.0. It's a fairly sophisticated piece of software, that integrates with a static verification tool called Pex. The analysis tools can detect contract violations at compile-time, and even generate test cases for each violation.

Unfortunately, their license forbids commercial application of the pre-release library, even if you just want to utilize runtime contract checking.

Sasa.CodeContracts is a Microsoft API-compatible implementation of the CodeContracts library. This is only a runtime library, and does not provide the Pex integration with static analysis and automated test generation.

Precondition checking is enabled, but postconditions and object invariants require CIL re-writing, so they are not currently supported. I will be looking into using Mono.Cecil to rewrite the IL to support post-conditions and invariants in the future.

TODO for v1.0

There are a few items remaining before v1.0 is released, but the library is usable as-is. Notably missing is MIME parsing for MailMessage, which will be added for v1.0. Also serialization will get improved safety almost on-par with standard framework serialization, and the compaction will be user-customizable for even more space savings in any given program.

Future Work

The Sasa API is fully documented with accompanying XML for code completion. Comments on the clarity of the API and documentation are welcome! Some tutorials on using these features safely are coming as well.

I'm dissatisfied with a few other approaches being pursued on the CLR, including:

Current approaches to parallel and concurrent programming, even Microsoft's Parallel Extensions and the Concurrency and Coordination Runtime.

CLR security is far too coarse-grained and pretty much unusable.

Efficient async I/O is too difficult to reason about (though the CCR does make it easier).

In lieu of a Pex static analysis, there is the possibility of QuickCheck-like test suites derived from CodeContract annotations.

Unfortunately, there is a "common wisdom" in the .NET world that an "is" test followed by an "as" cast is performing two casts, and in fact one should simply perform the "as" cast then check the result against null:

// to this form:if (o is string){ string a = o as string; Console.WriteLine(a);}

In fact, that's not the case, as any compiler worth its salt will coalesce the two tests into a single cast and branch operation. I took the tests from the above dispatching and altered them to perform the cast-and-null check, then I ran the tests again with the original is-then-as form. The latter form was about 6% faster on every timing run.

There's obviously some optimization being done here, but the lesson is: don't try to outsmart the compiler. In general, just write code the safe way and let the compiler optimize it for you. It's safer to perform a test then cast within a delimited scope like an if-statement, than to let the possibly null variable float around in the outer scope where you might use it inadvertently later in the method or during refactoring.

If performance turns out to be an issue, profile before trying these sorts of low-level "optimizations", because you might be surprised at the results.