Sasa.Enums

Sasa.Enums provides a statically typed API for working with enums, analogous to the dynamically typed System.Enum. Every System.Enum method that accepts a System.Type representing the enum type instead accepts a type parameter here, constrained to be an enum type.

Sasa.Enums.HasFlag

The Sasa.Enums.HasFlag extension method checks for the presence of flag bits set in an enum that has the FlagsAttribute applied:
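The check itself is a bitwise test. A minimal sketch using only the standard library, assuming a hypothetical Permissions enum; the helper HasAll is a stand-in written for illustration, since the exact signature of the Sasa extension method is not shown here:

```csharp
using System;

[Flags]
enum Permissions { None = 0, Read = 1, Write = 2, Execute = 4 }

static class HasFlagSketch
{
    // Hypothetical helper: every bit of 'flag' must also be set in 'value'.
    public static bool HasAll(Permissions value, Permissions flag)
    {
        return (value & flag) == flag;
    }

    static void Main()
    {
        var p = Permissions.Read | Permissions.Write;
        Console.WriteLine(HasAll(p, Permissions.Write));   // True
        Console.WriteLine(HasAll(p, Permissions.Execute)); // False

        // System.Enum.HasFlag (.NET 4.0+) is the dynamically typed counterpart;
        // its parameter is typed as System.Enum rather than Permissions.
        Console.WriteLine(p.HasFlag(Permissions.Read));    // True
    }
}
```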

Sasa.Enums.Names

The Sasa.Enums.Names static method provides an enumerable sequence of strings corresponding to the enum's descriptive names. Essentially, this is the equivalent of System.Enum.GetNames, except it's statically typed with an enum constraint, and it returns a cached immutable sequence, so it avoids the overhead of allocating an array on every call:
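One way to get this caching behaviour in plain C# is a static generic class, whose initializer runs once per enum type. This is only a sketch of the idea, not Sasa's implementation; EnumNames<T> is a hypothetical name:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for illustration. 'struct' is a looser constraint
// than 'enum'; a non-enum T fails when the type is first initialized.
static class EnumNames<T> where T : struct
{
    // Runs once per closed type T, so Enum.GetNames allocates its array a
    // single time; callers only ever see a read-only wrapper around it.
    static readonly IList<string> names = Array.AsReadOnly(Enum.GetNames(typeof(T)));

    public static IEnumerable<string> Names { get { return names; } }
}

static class NamesSketch
{
    static void Main()
    {
        foreach (var name in EnumNames<DayOfWeek>.Names)
            Console.WriteLine(name);

        // Every call hands back the same cached instance.
        Console.WriteLine(ReferenceEquals(EnumNames<DayOfWeek>.Names,
                                          EnumNames<DayOfWeek>.Names)); // True
    }
}
```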

Sasa.Enums.Values

The Sasa.Enums.Values method provides all the underlying values of the given enum type. Like the Enums.Names method, it returns a cached immutable sequence so it avoids the overhead of allocating arrays for every call:
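To make the allocation difference concrete, the sketch below compares System.Enum.GetValues, which returns a fresh array on each call, with a cached, typed sequence. EnumValues<T> is a hypothetical stand-in, not Sasa's code:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for illustration.
static class EnumValues<T> where T : struct
{
    // Enum.GetValues returns an untyped Array whose runtime type is T[],
    // so the cast recovers the static type exactly once.
    static readonly IList<T> values = Array.AsReadOnly((T[])Enum.GetValues(typeof(T)));

    public static IEnumerable<T> Values { get { return values; } }
}

static class ValuesAllocationSketch
{
    static void Main()
    {
        // Two calls to the BCL method yield two distinct arrays.
        Console.WriteLine(ReferenceEquals(Enum.GetValues(typeof(DayOfWeek)),
                                          Enum.GetValues(typeof(DayOfWeek)))); // False

        // The cached sequence is shared between calls.
        Console.WriteLine(ReferenceEquals(EnumValues<DayOfWeek>.Values,
                                          EnumValues<DayOfWeek>.Values));      // True
    }
}
```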

Saturday, June 29, 2013

I've just uploaded the final Sasa v0.9.4 release to Sourceforge and Nuget. The full API documentation for all assemblies in the Sasa framework is available here. The full changelog is available in the Sourceforge download, or directly in the repo here. Suffice it to say, changes since v0.9.3 include hundreds of bugfixes, and many, many new features.

A brief overview of what Sasa is, and what features it provides is covered on the wiki, and reproduced below.

Sasa Class Libraries for .NET

Sasa is a set of organized class libraries for the .NET framework 3.5 or higher. Here's an overview of the assemblies available and the features they provide:

After this post, I will have covered everything in the core Sasa assembly, Sasa.dll. The first two posts covered Sasa.Parsing.dll and Sasa.Dynamics.dll, and now that the core assembly has been covered, I have only a few final cleanups of the codebase before I release v0.9.4 final on Sourceforge and Nuget.

Of course, there still remains Sasa.Concurrency, Sasa.Binary, Sasa.Collections, Sasa.Numerics, Sasa.Linq, Sasa.Mime, Sasa.Net, and Sasa.Reactive to document, at the very least. The world of Sasa is far larger than what's been covered so far, so there's plenty left to explore! I will continue to write periodic blog posts or other sorts of documentation covering these assemblies, but I won't hold up the v0.9.4 release any further.

Sasa.Lazy<T>

.NET 4.0 was released with a Lazy<T> type, although the one in Sasa predates it by quite a bit and is somewhat simpler. In principle, there is little difference between a Lazy<T>, a Task<T>/Future<T> and a reactive value, React<T>, like that found in the Sasa.Reactive assembly. In fact, the latter largely generalizes the semantics of the previous three, since you can easily register for notifications and construct chained computations à la promise pipelining. As such, Sasa.Lazy<T> may one day be replaced by React<T> or something even more general. But it's available in the meantime, it's simple, and it's well tested in production environments.

Sasa.Lazy<T> Interfaces

The lazy type implements the following core interfaces: IValue<T>, IOptional<T>, IResolvable<T>, IVolatile<T>, and IResult<T>. To summarize, this set of interfaces exports the following methods and properties:

// starts the lazy computation, if not already
// run, and returns the computed value
T IValue<T>.Value { get; }
// returns false if the computation has not yet started
// or it returned an error of some kind
bool IResolvable<T>.HasValue { get; }
// returns false if the computation has not yet started
// or it returned an error of some kind; otherwise returns
// true and sets 'value' to the encapsulated value
bool IVolatile<T>.TryGetValue(out T value);
// provides the exception generated by the lazy computation
// if any was generated; if an exception was generated, then
// HasValue and TryGetValue both return false
Exception IResult<T>.Error { get; }
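
The semantics described in those comments can be sketched with a small class. This is an illustration under stated assumptions, not Sasa's implementation: LazySketch<T> is a hypothetical name, it is not thread-safe, and rethrowing a captured error from Value is my assumption rather than documented behaviour.

```csharp
using System;

// Hypothetical sketch for illustration; not thread-safe.
sealed class LazySketch<T>
{
    Func<T> compute;   // set to null once the computation has run
    T value;
    bool hasValue;

    public LazySketch(Func<T> compute) { this.compute = compute; }

    // The exception thrown by the computation, if any.
    public Exception Error { get; private set; }

    // False until the computation has run and succeeded.
    public bool HasValue { get { return hasValue; } }

    // Forces the computation on first access. Rethrowing a captured
    // error here is an assumption made for this sketch.
    public T Value
    {
        get
        {
            if (compute != null)
            {
                try { value = compute(); hasValue = true; }
                catch (Exception e) { Error = e; }
                compute = null;
            }
            if (Error != null) throw Error;
            return value;
        }
    }

    // Never forces the computation; reports only what is already known.
    public bool TryGetValue(out T result)
    {
        result = value;
        return hasValue;
    }
}

static class LazyDemo
{
    static void Main()
    {
        var lazy = new LazySketch<int>(() => 6 * 7);
        int v;
        Console.WriteLine(lazy.HasValue);           // False
        Console.WriteLine(lazy.TryGetValue(out v)); // False
        Console.WriteLine(lazy.Value);              // 42
        Console.WriteLine(lazy.HasValue);           // True
    }
}
```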

Sasa.Values

Sasa.Values is a static class intended to encapsulate a set of useful methods defined for all values. It currently exports only a single overloaded extension method.

Sasa.Values.IsIn

Sasa.Values.IsIn is an overloaded extension method used to check if a value is contained within a collection of other items. Logically, it's equivalent to SQL's "IN" operator. In its most general form, IsIn is a shorthand for Enumerable.Contains, but the more specific overloads are far more efficient and don't require the construction of a collection to check membership:
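As a rough sketch of the two shapes described here, the following defines a general overload that defers to Enumerable.Contains and a fixed-arity overload that allocates nothing. The class name, parameter names and the choice of a three-item overload are assumptions for illustration; Sasa's actual overload set may differ:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins; names and overload set are assumptions.
static class ValuesSketch
{
    // General form: a shorthand for Enumerable.Contains.
    public static bool IsIn<T>(this T value, IEnumerable<T> items)
    {
        return items.Contains(value);
    }

    // Fixed-arity form: compares directly, so no collection is built.
    public static bool IsIn<T>(this T value, T a, T b, T c)
    {
        var eq = EqualityComparer<T>.Default;
        return eq.Equals(value, a) || eq.Equals(value, b) || eq.Equals(value, c);
    }

    static void Main()
    {
        Console.WriteLine(3.IsIn(1, 2, 3));                          // True
        Console.WriteLine("z".IsIn("x", "y", "w"));                  // False
        Console.WriteLine(5.IsIn(new List<int> { 4, 5, 6 }));        // True
    }
}
```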

As explained before, you can already perform this check in standard C#, but it's far more verbose. You either have to create a switch statement, or a set of logical disjunctions in an if-statement, or you have to construct a collection and call Contains like so:
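For comparison, the three standard-C# alternatives named above look roughly like this; the method names and the sample values 1, 3 and 5 are arbitrary choices for illustration:

```csharp
using System;
using System.Linq;

static class MembershipSketch
{
    // Alternative 1: a switch statement with stacked case labels.
    public static bool ViaSwitch(int x)
    {
        switch (x)
        {
            case 1: case 3: case 5: return true;
            default: return false;
        }
    }

    // Alternative 2: a chain of equality tests joined with ||.
    public static bool ViaIf(int x)
    {
        return x == 1 || x == 3 || x == 5;
    }

    // Alternative 3: allocate a collection, then call Contains.
    public static bool ViaContains(int x)
    {
        return new[] { 1, 3, 5 }.Contains(x);
    }

    static void Main()
    {
        Console.WriteLine(ViaSwitch(3));   // True
        Console.WriteLine(ViaIf(4));       // False
        Console.WriteLine(ViaContains(5)); // True
    }
}
```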

As you can see above, there are two overloads, one accepting a simple value, and one accepting a function that produces a value for the cases where the default value may be expensive to produce.

Note that there may be a few more abstractions or extension methods in the core Sasa dll that weren't entirely documented in this series, but these are abstractions whose API isn't entirely satisfactory, and may soon be refactored or removed. For instance, this is the case with Dictionaries.FindOrOtherwise, which has been in Sasa for quite a while, and which I've used in some production code, but will probably be replaced by the more general FindOrDefault.

It's well known that C# forbids certain core types from appearing as constraints, namely System.Enum, System.Delegate, and System.Array. Sasa.TypeConstraint<T> and Sasa.TypeConstraint<T, TBase> are two purely declarative abstract classes that are recognized by Sasa's ilrewrite tool, and which permit programs to specify the above forbidden constraints. This pattern is used throughout Sasa to implement functionality allowed by the CLR but that would normally be forbidden by C#.

There are two definitions of TypeConstraint, one with only a single type parameter, and one with two type parameters. The second extended definition is unfortunately required to express some corner cases. It's primarily used in Sasa.Operators, and it's generally only needed if you're going to use some methods defined with TypeConstraint within the same assembly. If you can factor out those constrained methods into a separate assembly, you shouldn't ever need it.

Sasa.TypeConstraint.Value

The Sasa.TypeConstraint.Value property allows code compiled assuming a value of type TypeConstraint<Foo> to access the underlying value of type Foo. The ilrewrite tool erases calls to this property as well, leaving the access to the raw value. The following example is actually the definition of Sasa.Func.Combine:

TypeConstraint also defines implicit conversion from T to TypeConstraint<T> and TypeConstraint<T, TBase>, so you should never have to construct such an instance manually. If you forget to run ilrewrite on an assembly, attempting to construct or access any members of TypeConstraint will throw a NotSupportedException naming the assembly that needs rewriting.

Any options not listed above will simply be ignored by ilrewrite. The /verify option runs the platform's "peverify" tool to ensure the output passes the CLR verification tests. The /key option additionally creates strongly named assemblies from the given key file. This way you can keep strong names; you just need to defer signing to ilrewrite.

I should also note that this is how Sasa is complying with the LGPL while also providing strongly named assemblies with a privately held key. The LGPL stipulates that Sasa users ought to be able to replace my assemblies with their own whenever they wish, and this is possible using ilrewrite. An assembly that was built against my Sasa.dll simply needs to pass through an ilrewrite run that's given a different Sasa.dll signed with another key, and the output assembly will then be bound to the new assembly and key. After a brief exchange with an associate member of the EFF, these terms seemed satisfactory, although I should note that he isn't licensed to practice law, and his opinion does not constitute an official EFF response on this issue.

To integrate ilrewrite into my VS builds, I simply place ilrewrite in a bin\ folder under the root folder of my solution, then add the following line to my post-build event:

When running a DEBUG build, $(ConfigurationName) specifies the /DEBUG option, and any other configuration specifies an unknown option that ilrewrite simply ignores.

Sasa.Raw: Building on Sasa's Constrained Operations

From time to time, it may be necessary to build upon the constrained operations provided by Sasa, i.e. defining a generic Func.Combine that differs from the one above by taking some additional parameters. Here you'll run into a little trouble because you'll need to specify TypeConstraint<T> on your function, but the rewritten function in Sasa.dll is expecting a T. For this reason, Sasa also ships with Sasa.Raw.dll, which is Sasa.dll prior to rewriting and signing. This means all the TypeConstraint IL is intact, and you can write your extensions as needed.

The step by step procedure is:

1. Link to Sasa.Raw.dll, not Sasa.dll, when building your project.

2. In the post-build event, delete Sasa.Raw.dll and copy over Sasa.dll.

3. In the post-build event, run ilrewrite as you normally would.

This is exactly the same procedure you'd follow when replacing my Sasa release with someone else's. For instance, here's the post-build event for Sasa.Reactive:

Sasa.IO.DisposableFile is a simple object intended to automate the handling of temporary files. Like most IDisposable objects, a DisposableFile is intended to be used within a "using" block, and the file it references is deleted upon exiting the block:
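The pattern can be sketched with the standard library alone. TempFile below is a hypothetical stand-in, not Sasa's DisposableFile, and its constructor and FullPath property are assumptions made for the example:

```csharp
using System;
using System.IO;

// Hypothetical stand-in for illustration.
sealed class TempFile : IDisposable
{
    public string FullPath { get; private set; }

    public TempFile()
    {
        // Path.GetTempFileName creates a zero-byte file and returns its path.
        FullPath = Path.GetTempFileName();
    }

    public void Dispose()
    {
        // File.Delete does not throw if the file no longer exists.
        File.Delete(FullPath);
    }
}

static class TempFileDemo
{
    static void Main()
    {
        string path;
        using (var tmp = new TempFile())
        {
            path = tmp.FullPath;
            File.WriteAllText(path, "scratch data");
            Console.WriteLine(File.Exists(path)); // True
        }
        // Leaving the using block called Dispose, which deleted the file.
        Console.WriteLine(File.Exists(path));     // False
    }
}
```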

Initially the benchmarks were disappointing, but 2 minutes with the profiler revealed the problem was the Tree.Add method, which was using a very simple but poor implementation. Basically, it was checking if the key was already in the tree before attempting to update, thus performing the traversal twice for every addition. I refactored this to share the same implementation as Tree.Update which performs only a single traversal, and the results are now more reasonable.
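The cost of the check-then-update shape is easy to see on a toy structure. The sketch below uses a small persistent binary search tree rather than a HAMT, purely to count node visits; all names are hypothetical and none of this is Sasa's Tree code:

```csharp
using System;

// Hypothetical persistent binary search tree, used only to count traversals.
sealed class Node
{
    public readonly int Key;
    public readonly Node Left, Right;
    public Node(int key, Node left, Node right) { Key = key; Left = left; Right = right; }
}

static class TreeSketch
{
    public static int Visits; // number of nodes touched so far

    public static bool Contains(Node n, int key)
    {
        while (n != null)
        {
            Visits++;
            if (key == n.Key) return true;
            n = key < n.Key ? n.Left : n.Right;
        }
        return false;
    }

    // Single traversal: copies the path down to the insertion point.
    public static Node Insert(Node n, int key)
    {
        if (n == null) return new Node(key, null, null);
        Visits++;
        if (key == n.Key) return n;
        return key < n.Key
            ? new Node(n.Key, Insert(n.Left, key), n.Right)
            : new Node(n.Key, n.Left, Insert(n.Right, key));
    }

    // The slow shape: a lookup first, then a second walk to insert.
    public static Node CheckThenInsert(Node n, int key)
    {
        return Contains(n, key) ? n : Insert(n, key);
    }

    static void Main()
    {
        Node root = null;
        foreach (var k in new[] { 4, 2, 6, 1, 3, 5, 7 }) root = Insert(root, k);

        Visits = 0; CheckThenInsert(root, 8);
        Console.WriteLine(Visits); // 6

        Visits = 0; Insert(root, 8);
        Console.WriteLine(Visits); // 3
    }
}
```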

The benchmarks were run on an FX-8120 performing 200,000 individual inserts and 200,000 individual membership checks on a set of unique integers, i.e. treating the dictionaries and trees as sets. The inserts were clocked separately from the membership tests.

Insertions into the HAMT appear to be about 15x slower than insertions into the mutable dictionary when averaged over the bulk insert benchmark. There is a way to perform bulk inserts much more efficiently, but it wouldn't give a sense for incremental update costs.

Membership tests are ~2x slower for the HAMT, which seems like it's in the right ballpark for an initial implementation. The HAMT also uses a little less than twice the memory of the mutable collections according to the memory statistics after forcing a full GC.

According to profiling data, about 40% of the time in the insert benchmark is spent allocating new arrays, so there doesn't seem to be much room to improve the runtime of updates except perhaps by reducing allocations. I believe Clojure's persistent vectors implement some optimizations here, but I haven't had the need to dig into their implementation.

Lookup costs seem almost entirely related to virtual dispatch overhead while performing recursive lookups on sub-trees. About 45% of the time in the lookup benchmark is spent there. I'm not quite sure how to reduce this overhead, except perhaps to eliminate the class hierarchy that defines the tree structure and use faster type tests and casts. I'm not convinced it would make that much of a difference, but perhaps I'll give it a try if I'm bored some day.

If anyone has any suggestions or pointers to a simple explanation of Clojure's tricks or some other HAMT optimizations, please let me know!

Many abstractions in Sasa provide purely functional semantics since such features tend to be absent in .NET's base class libraries. Sasa.Collections.Arrays is then exactly what it sounds like: a purely functional interface for manipulating one-dimensional arrays. The exported API includes extension methods for slicing, setting slots, appending, inserting and removing elements, all without mutating the original array.

I don't recall the original inspiration for this API, but it was likely some combination of APL and concatenative languages like Forth and Factor. The interface is also not necessarily complete, but it's sufficiently complete that I could relatively easily implement a hash-array mapped trie.

Sasa.Collections.Arrays.Append

The Arrays.Append extension method creates a new array with a new value appended to the end:
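A non-mutating append of this kind can be sketched as a copy into a slightly larger array. ArraysSketch is a hypothetical stand-in and its signature is an assumption; it is not Sasa's implementation:

```csharp
using System;

// Hypothetical stand-in for illustration.
static class ArraysSketch
{
    // Returns a new array one slot longer; 'source' is never modified.
    public static T[] Append<T>(this T[] source, T value)
    {
        var result = new T[source.Length + 1];
        Array.Copy(source, result, source.Length);
        result[source.Length] = value;
        return result;
    }

    static void Main()
    {
        var a = new[] { 1, 2, 3 };
        var b = a.Append(4);
        Console.WriteLine(string.Join(",", a)); // 1,2,3
        Console.WriteLine(string.Join(",", b)); // 1,2,3,4
    }
}
```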