Thursday, December 10, 2009

I've been experimenting with some purely functional collections, also known as persistent collections, and began to wonder precisely how much overhead they incur as compared to the standard framework collections on a VM platform such as .NET.

My open source Sasa library has long had a persistent stack type, considered a list in functional languages, and I was considering adding some additional persistent collections. I therefore wanted a better understanding of the various costs involved.

I had also wanted to test an interesting design alternative to standard class-based collections. One of the consistent nuisances on widely deployed VM platforms is the widespread presence of null values. C#'s extension methods somewhat mitigate this problem since you can call extension methods on null values and properly handle it, but calling instance methods on a null value throws a NullReferenceException. Therefore, you cannot invoke ToString or Equals without first checking for null.

This is particularly annoying for data types where null is in fact a valid value, as it is in linked lists, where null denotes the empty list. The idea I had was to make the data type a struct, wrapping an inner reference that would hold the actual collection contents, basically moving the null into a non-nullable type that I control:

A struct type can never be null, so I can overload Equals and ToString behaviour, but still handle null as a valid value -- a null inner reference indicates an empty list, for example.

I copied over the Sasa Seq type, implemented a struct-based version of it, and similarly implemented two persistent versions of a queue type. I then ran some quick benchmarks against the system collection classes, List<T>, Stack<T> and Queue<T> using a random sequence of 10,000,000 enqueues and dequeues. This test was repeated 10 times in each run, and the average runtime and memory use was taken.

This process was then repeated 13 times for each collection type, and I threw away the top and bottom two values. The results follow, where all values are in CPU ticks measured by System.Diagnostics.StopWatch. The results are displayed in sorted order, with the fastest at the top, and slowest at the bottom.

Queue Operations

Legend:

PQueue<T>: persistent class-based queue.

LinkedList<T>: System.Collections.Generic.LinkedList.

PQueue2<T>: persistent struct-based queue.

Queue<T>: System.Collections.Generic.Queue.

PQueue<T>

LinkedList<T>

PQueue2<T>

Queue<T>

7699324462.55

7890322380.36

7679249692.36

7761780264

7718837154.91

7983445563.64

7687912510.55

7803131562.18

7740207043.64

8023998067.64

7703769050.18

7810452602.91

7828445642.91

8032055720.73

7800245839.27

7811040914.91

8451346377.45

8201663652.36

7834188839.27

7850104080

8589153123.64

8282536765.09

7955433135.27

7935923176

8707905445.82

8484668378.91

8178008811.64

8118527255.27

8764140750.55

8742958103.27

8323106361.45

8348756291.64

8979406774.55

8752582049.45

8327369725.09

8350023904

Averages

8275418530.67

8266025631.27

7943253773.9

7976637783.43

Stack Operations

Legend:

Seq<T>: persistent class-based stack.

List<T>: System.Collections.Generic.List.

Seq2<T>: persistent struct-based stack.

Stack<T>: System.Collections.Generic.Stack.

List<T>

Seq<T>

Stack<T>

Seq2<T>

7710263064

7826307262.55

7802069434.91

7690359998.55

7762791045.82

7870793184

7808408813.82

7770721440.73

7849427981.09

8035572504.73

7840059360

7989185347.64

7905690538.18

8099787029.09

7883823349.09

8201203589.82

7912993893.82

8315915441.45

7907107597.82

8208994094.55

7947893535.27

8349995466.18

7914102997.82

8239609125.82

7952697770.18

8396224841.45

7927640684.36

8352575028.36

8131449759.27

8694619845.82

8155345951.27

8376635294.55

8388583725.09

8738260760

8294985193.45

8540294446.55

Averages

7951310145.86

8258608481.7

7948171486.95

8152175374.06

Analysis

The persistent collections perform rather well, generally within 5% of their mutable counterparts given this set of test data. While I didn't calculate it, you can see from the variance in the runs that there's a wider standard deviation from the mean for persistent collections, so their performance is ever so slightly less predictable, though not drastically so.

Interestingly enough, the struct-based persistent collections outperformed the class-based versions. I expected the reverse considering struct operations are not always properly optimized by the JIT. Even though the entire struct would fit into a register, the JIT may still allocate a stack slot for it, which would be more expensive than the guaranteed register-sized operations of a class type. Upon further thought, I suspected that perhaps the struct versions are faster simply because the VM doesn't need to perform a null check on dispatch, but the class-based versions use all-static method calls which don't perform null checks as far as I know.

I don't yet have a good explanation for this, but given the results, I believe I will move all Sasa collections to struct implementations as it simply provides more flexibility, immunity from null errors, and no appreciable runtime overhead.

If I had more time I would make these tests a little more rigourous by varying the test vectors in a more controlled fashion instead of just random, ie. use stepped, random, and other types of enqueue/dequeue sequences, to determine exactly how persistent and mutable collections behave based on inputs. I suspect the performance profiles will differ more drastically in such different scenarios.

This test was enough to demonstrate to me that persistent collections are sufficiently performant to be used in daily code, particularly given their numerous other advantages.

The above is just the interface, minus the comments which you can see at the above svn link. This struct tracks path components for you and C#'s operator overloading lets you specify paths declaratively without worrying about combining paths with proper separator characters, etc.:

The library will generate path strings using the platform's directory separator.

One significant departure from the Haskell library is the lack of phantom types used to distinguish the various combinations of relative/absolute and file/directory paths. C# can express these same constraints, but I intentionally left them out for two reasons:

The type safety provided by the file/directory phantom type is rather limited, since it's only an alleged file/directory path; you have to consult the file system to determine whether the alleged type is actually true. The only minor advantage to specifying this in a path type, is as a form of documentation to users of your library that you expect a file path in a particular method, and not a directory path. I could be convinced to add it for this reason.

The relative/absolute phantom type seems a bit useless to me, though the reason may be less obvious to others. IMO, the set of file and directory info classes should not have GetParent() or support ".." directory navigation operations, nor should they support static privilege escalation operations like File.GetAbsolutePath() which ambiently converts a string into an authority on a file/directory, since this inhibits subsystem isolation; without GetParent() or privilege escalation functions, any subsystem you hand a directory object is automatically chroot jailed to operate only in that directory. This has long been known to the capability-security community, and is how they structure all of their IO libraries (see Joe-E and the E programming language). Thus, in a capability-secure file system library, all paths are relative paths interpreted relative to a given directory object. As a result, FsPath also strips all "." and resolve all ".." to within the provided path to supply this isolation.

It goes without saying that .NET does not currently handle files and directories this way, but I plan to handle this eventually as well. FsPath will play a significant role in ensuring that paths cannot escape the chroot jail.

As an interim step along that path, the FsPath interface is a good first step.

Monday, November 16, 2009

I just completed a statically typed Pratt-style single-state extensible lexer+parser, otherwise known as a top-down operator precedence parser, for the Sasa library. The implementation is available in the Sasa.Parsing dll, under Sasa.Parsing.Pratt. Two simple arithmetic calculators are available in the unit tests.

This implementation is novel in two ways:

Aside from an alleged implementation in Ada, this is the only statically typed Pratt parser I'm aware of.

Pratt parsers typically require a pre-tokenized input before parsing semantic tokens, but I've eliminated this step by using the symbol definitions to drive a longest-match priority-based lexer.

Symbols by default match themselves, but you can optionally provide a scanning function used to match arbitrary string patterns. The symbol selected for the current position of the input is the symbol that matches the longest substring. If two symbols match equally, then the symbol with higher precedence is selected.

The design was heavily influenced by this Python implementation, though numerous departures were made to restore static typing and integrate lexing. In particular, symbols and semantic tokens are separate, where symbols are used primarily during lexing, and construct the appropriate semantic tokens on match.

The above is a simplified version of the calculators used in the unit tests. You can clearly see the definition of left and right-associative infix operators, and postfix operators. The numbers are precedence levels, and the delegates are the semantic actions associated with each token type.

There are pre-defined functions for creating left and right-associative infix, prefix, postfix, and both prefix and infix ternary operators -- prefix ternary is a simple "if _ then _ else _", infix ternary is like C's ternary operation, "_ ? _ : _". You are not constrained to these symbol types however, as you can define arbitrary character sequences as symbols and parse the results at your whim.

Adding variable binding to the above calculator requires a few extensions to support lexically scoped environments, but the parser requires only the following two extensions:

"let" is not matched as an identifier because the "let" symbol has a higher precedence. Parser definitions with ambiguous parses thus require some care to resolve the ambiguity either via match length or precedence relationships.

There are two popular choices when it comes to parsing: parser generator tools and parser combinator libraries. I've never liked generators personally, as they require a separate language and entirely separate tools to define and process the grammar thus increasing the learning and maintenance curve.

I actually like parser combinators, and there are now good techniques for supporting arbitrary left recursion in grammars (see Packrat parsing in OMeta). However, handling operator precedence often involves a series of complex extensions to the otherwise simple recursive descent, or it requires the parsed results to maintain all operator and grouping information so a post-processing phase can readjust the parse tree to reflect the proper priorities. The post-processing phase typically uses a shunting yard or Pratt-style operator precedence algorithm to resolve precedence, so why not just build the parser itself directly on Pratt parsing?

Unlike parser-combinator libraries, of which I've built a few, Pratt parsers natively support arbitrary operator precedence and are efficient as they require no backtracking. Pratt parsers are also Turing-complete, because the semantic actions can perform arbitrary computations on the parser input. Turing completeness carries both benefits and costs of course, but the single-state nature of Pratt parsers keeps it manageable. The "single state" of the parser is simply the current token matched from the input.

For example, if prefix-ternary operators weren't already available, you could simply define one like so:

The above builds on "Prefix", but you could break this down even further as in Pratt's original paper and define arbitrary symbol behaviour by providing a scanning function, "left denotation", and "null denotation", and so on.

To define a parser, you need only inherit from Sasa.Parsing.Pratt.PrattParser, provide the semantic type T, and override the constructor for your parser to define your symbol table. The only downside is that each parser instance is monolithic and not thread-safe, so it can only parse one input at a time. There is no limit to how many parsers you can create however.

Pratt parsing is simple, compact, and efficient, so I hope it finds more widespread use, particularly now that there is a statically typed implementation for a popular language.

[Edit: the PrattParser will be available in the upcoming v0.9.3 release of the Sasa class library.]

Thursday, October 1, 2009

I've written quite a few times about my experiments with the CLR type system [1, 2, 3]. After much exploration and reflection, I had devised a pretty good approach to encoding ML-style modules and abstracting over type constructors in C#.

The best encoding of ML modules and type constructor polymorphism requires the use of partly safe casting.

An ML signature maps to a C# interface with a generic type parameter called a "brand". The brand names the class that implements the interface, ie. the module implementation.

An ML module maps to a C# class. If the module implements a signature, then it implements the corresponding interface and specifies itself as the signature's brand.

Since classes and interfaces are first-class values, an ML functor also maps to a class.

An ML type component maps to an abstract class that shares the same brand as the module. This effectively ties the the module data representation and the module implementation together at the interface boundary, and makes the necessary casting partly safe.

Programs written against tagless interpreters are wrapped in functors in order to properly abstract over the interpreter implementation. As mentioned before, modules and signatures are effectively first-class values in this encoding, so a functor simply becomes a function:

The brand/ISymantics type could also be lifted to be a generic class parameter to make it syntactically closer to how it looks in the paper, but the difference is not important.

You can now run EX.Test1 with any conforming implementation of ISymantics, and the type system will prevent you from mixing representations of different implementations just as it would in ML, because the brands will not match. The only way to trigger a type error due to the downcast, is if the client implements his own Repr<T, B> supplying R for the brand, then passing the custom Repr type in to a method on ISymantics<R>. In such cases the client deserves an error.

I think this is a fairly reasonable trade off all things considered. Of course, it would be preferable if the CLR could just support type constructor polymorphism natively. And while all my wishes are coming true, can I have all of these changes too?

There are specific overloads for all System.Action delegate types. A general RaiseAny extension method is also provided that isn't very efficient, but which should match the signature of any event in the class libraries. More overloads can be provided for more efficient execution if needed.

Here is the complete list of changes from the changelog:

= v0.9.2 =

* fixed bug where quoted printable encoding failed when = was the last character.* added Pop3Session.Reset method.* compact serializer no longer uses stream positions to track cached objects, so non-indexable streams, like DeflateStream, are now usable.* ISerializing interface generalized to support serializing non-primitive field values.* added e-mail subject decoding.* fixed boundary condition on QuotedPrintableEncoding.* added extension methods to support safely raising events.* added an extension method to safely generate field and property names as strings.* added parameter to Compact serializer to indicate whether SerializableAttribute should be respected.* added a boundary check for SliceEquals.* added a event Raise overload for any event handlers whose arg inherits from EventArgs, which enables safe event raising within an object.* MailAddressCollection already handles parsing comma-delimited e-mail address strings, so don't attempt to split them manually.* added a test for an e-mail address that contains a comma.* default to us-ascii charset if none provided.* NonNull now throws a ArgumentNullException with a proper description.* many FxCop-related fixes and CLS compliance improvements.* added usage restrictions on Sasa.CodeContracts attributes.

[Edit: generating HTML docs from the XML files used to be such a chore, though Sandcastle at least makes it possible, if convoluted. Mono's mdoc util is barely any help here, as it can only process one Visual Studio generated XML file at a time. I just found ImmDoc.Net which made this process ridiculously easy and stupendously fast, easily an order of magnitude faster than Sandcastle, but it looks like there's a bug handling method ref and out parameters.

Sunday, August 16, 2009

I use the very respectable SlimTimer to help me track my hours. Unfortunately, while they display a consistent total across all reports, the entries in each report do not necessarily add up to that total due to the fractional time units and the rounding involved.

Unfortunately, the accounting department tends to frown upon inconsistencies like this, no matter the reason. My process thus far has been to simply export a full timesheet with the report settings specifying time units as precisely as possible, and then performing the sum myself on the resulting chart.

This got a bit tedious, so I wrote a program to compile the necessary tallies over however many timesheet files I wanted to process. The source and binaries are available for download. Simply drag and drop any number of timesheets generated from SlimTimer, and the utility will generate a new csv file for each timesheet, with an extension "-out.csv".

The format of the output csv file is formatted for my invoice structure, but adjusting it for other formats is fairly trivial for anyone familiar with C#.

Tuesday, June 23, 2009

Awhile back I described an idea to transparently execute code server-side or client-side given the same program. I've finally gotten around to implementing this using my encoding for type constructor polymorphism in C#.

Here is an example you can run from the original paper showcasing exponentiation which can execute transparently either server-side or client-side. The server-side code is intentionally limited to bases less than 100 and exponents less than 20.

The code-behind for the .aspx page interprets this code twice, once with a LINQ-based compiler for server-side execution, and once with a JavaScript backend to generate the client-side program.

The client-side program depends only one utility function to compute the fixed point of a given function:

function fix(f) { return function self(x) { return f(self)(x); };}

Clearly any programs written this way are fairly verbose, but it's fairly straightforward to define wrapper structs around E<T, B>, such as Eint, Ebool, etc., to provide overloaded operators such that embedded programs look almost like ordinary C#. The current ISymantics interface is also limited to single-arg functions, but it's possible to add more overloads to support multi-arg functions.

Of course, it's already possible to translate C# expressions to JavaScript if we write a LINQ expression visitor. However, LINQ expressions are in a sense more limited than operating on this lifted data type. For instance, it's not possible to translate a static function to JavaScript using LINQ expressions, but it is possible using the final tagless representation.

It's an open question whether such final tagless interpreters can be made usable enough for real programming. However, the techniques employed here, in particular the structure which enables semi-safe type constructor polymorphism, are definitely useful for other projects that require retargetable abstractions.

The full source code is available, and includes a simple C# evaluator, LINQ-based compiler, a JavaScript backend, and an interpreter that computes the depth of the expression tree, which was also an example in the tagless paper. Also included is a partially working partial evaluator. This proved the trickiest to implement, and the only remaining problem appears to be recursive partial evaluation.

As future work, the partial evaluator can be parameterized by the type of compiler, so it can partially evaluate managed code and JavaScript programs; it is currently needlessly tied to the managed code interpreter. I also plan to implement the aforementioned wrapper structs and other enhancements to improve the clarity of embedded programs, hopefully to the point where they are usable for real programming.

[Edit 2009-08-16: it just came to my attention that the JS failed on the first run, but succeeded after a server-side execution was performed. This was a DOM manipulation bug, and the source and server code have been updated to fix this.]

MIME MailMessage parsing and the Pop3Client are already in use in production code, and conformance appears adequate after hundreds of processed messages.

Experimental Additions

There are a few components of this release that I would deem "experimental".

Sasa.Dynamics is intended as a "safe reflection" facility. Basically, this is a "type-case" construct as found in the "intensional type analysis" literature. Any reflective algorithm should be implementable via ITypeFunc<R>, and you cannot forget to handle a particular case. This interface basically factors out object traversal from the actual reflection algorithm.

Under Sasa.Web and Sasa.Web.Asp, I've included a URL-safe Base64 encoder/decoder, Sasa.Web.Url64, and a generic ASP.NET Page base class that is immune to clickjacking and CSRF.

I first proposed this idea on the capability security mailing list. I'm not completely satisfied with the current incarnation, but it's an interesting idea I'm still toying with.

Sunday, March 22, 2009

The tagged pointer representation described in Xavier Leroy's ZINC paper is compelling in its simplicity. Most data manipulated in typical programs is integer or pointer-based in some way, so using a 1-bit tagged representation allows unboxed integers, resulting in an efficient representation of the most common data types.

I've never been satisfied with the size header in heap-allocated types though, and the two bits reserved for GC pretty much forces you to use some mark-sweep variant. Integrating something like age-oriented collection with reference counting for the old generation would require an entirely new word for every block of memory. This is a prohibitive cost in functional programs.

Removing the Size Header

Reading about BIBOP techniques used in the Streamflow allocator gave me the idea of using page masking to determine the block size.

As in Streamflow, consider an allocator where each page obtained from the OS was partitioned into a list of equally-sized blocks. Each of these blocks does not need its size stored in an object header, since the size of the block is a property of the page it was allocated from, and we can easily obtain the page address by simply masking the pointer to the block. So the first block of every page is dedicated to storing the size, and all subsequent blocks are placed in the free list.

This works for structured data allocated in fixed-sized blocks, but unstructured data is dynamic in extent and can span pages. Unstructured data must also carry the GC header in the first word, even when spanning pages. However, we know that arrays and strings are the only primitive unstructured data described in ZINC, and they must now both carry their size information as part of the structure. We can thus easily identify "small" strings and arrays that could fit into a fixed-sized block.

As a possible space optimization, such "small" arrays and strings don't need to be accompanied by a size field. We can perform a simple test to distinguish "small" arrays: if the array pointer is more than one word off from a page-aligned address, it is structured data, since unstructured data that spans pages always starts on a page address + GC header size.

We now have a full 24 free bits in the block header, which we can reserve for use by the GC. 24 bits is enough to employ reference counting, or a hybrid mark-sweep for the nursery, reference counting for the old generation as in age-oriented collection. The GC to employ is now completely at the discretion of the runtime, and can even be changed while the program is running.

The Problem

There is a downside of course. Streamflow allocates fixed-sized blocks for sizes ranging from 4 bytes up to 2kB. Considering the above approach uses the first block to store the block size, we would waste up to 2kB on the larger block sizes. We could amortize this cost by allocating more pages and spreading the cost across them, but then we lose the simple page masking. We could avoid large fixed-sized blocks, maybe cutting them off at 512 bytes, but then we would still need some way to store the size for these "medium-sized" blocks.

We can't simply use a single word of the page to store the size, as we would then have an odd number of bytes to divide into fixed size blocks. Again, it's only a problem for larger block sizes. You can't split 4092 bytes into two 2kB blocks.

Another Approach

As a general solution to these problems, we can eliminate the size tag on the page entirely by maintaining a sorted array of address ranges and the block size of the range. When we need to discover the size of a block, we perform a binary search to find the address range the address falls in, and extract the size. This operation is O(log n), where n is the number of distinct address ranges, ie. sequences of consecutive pages allocated to the same block size. I would expect the number of such page regions to be under 50, so the logarithmic cost should be very low in practice, though the additional indirections accessing the array can be costly due to cache misses.

Summary

A block header representation that provides for GC independence would be compelling. The page-level size header is promising, but clearly has problems. Hopefully, some clever person will devise a general way to handle the larger fixed sized blocks without wasting too much space.

The solution to the problems with the page-level size header using a binary search is a mixed blessing. The additional runtime cost in discovering block size may be prohibitive, since it must be performed during GC traversal, and on every free operation. The costs of free might possibly be amortized somehow when done in bulk, but the cost incurred by GC traversal seems much more challenging to eliminate.

Tuesday, February 24, 2009

My .NET Sasa library has fallen by the wayside as I experimented with translating various functional idioms in my FP# library. Reading up on what a few other generic class libraries have been experimenting with, like Mono Rocks, spurred me to putting those experiments to use and updating Sasa. I significantly simplified a lot of the code, documented every class and method, and generalized as much as possible. The license is LGPL v2, and you can download the source via svn:

svn co https://sasa.svn.sourceforge.net/svnroot/sasa/tags/v0.8 sasa

Sasa Core v0.8

A set of useful extensions to core System classes and some useful classes for high assurance development.

Named tuple types: Pair, Triple, Quad.

Either types, representing one of many possible values. There are Either types for 2, 3, and 4 parameters, mimicking the Pair, Triple, and Quad structure. Tuples are "product" types, while Either is a "sum" type, and products and sums are duals. Since products are useful, I figured variously sized sum types might also find some uses. Time will well.

Sasa.Linq

IdentityVisitor which provides default implementations for all NodeTypes and performs no modifications to the expression, just returning it as-is.

ErrorVisitor which which throws NotSupportedException for all NodeTypes.

Sasa.Serialization

A stand-alone assembly with serialization classes.

Provides a compact serializer which requires only ReflectionPermission, and not SecurityPermission like the System classes do; this serializer can therefore be used in medium trust environments. The serializer currently requires a little more discipline from the developer to use correctly, but space savings of 100-200% are typical.

An experimental unsafe, highly compact binary serializer.

Sasa.Net

A library providing missing functionality under System.Net.

A Pop3Client class.

MailMessage parsing.

Sasa.CodeContracts

Microsoft Research is developing a design by contract library which they hope to release with .NET 4.0. It's a fairly sophisticated piece of software, that integrates with a static verification tool called Pex. The analysis tools can detect contract violations at compile-time, and even generate test cases for each violation.

Unfortunately, their license forbids commercial application of the pre-release library, even if you just want to utilize runtime contract checking.

Sasa.CodeContracts is a Microsoft API-compatible implementation of the CodeContracts library. This is only a runtime library, and does not provide the Pex integration with static analysis and automated test generation.

Precondition checking is enabled, but postconditions and object invariants require CIL re-writing, so they are not currently supported. I will be looking into using Mono.Cecil to rewrite the IL to support post-conditions and invariants in the future.

TODO for v1.0

There are a few items remaining before v1.0 is released, but the library is usable as-is. Notably missing is MIME parsing for MailMessage, which will be added for v1.0. Also serialization will get improved safety almost on-par with standard framework serialization, and the compaction will be user-customizable for even more space savings in any given program.

Future Work

The Sasa API is fully documented with accompanying XML for code completion. Comments on the clarity of the API and documentation are welcome! Some tutorials on using these features safely are coming as well.

I'm dissatisfied with a few other approaches being pursued on the CLR, including:

Current approaches to parallel and concurrent programming, even Microsoft's Parallel Extensions and the Concurrency and Coordination Runtime.

CLR security is far too coarse-grained and pretty much unusable.

Efficient async I/O is too difficult to reason about (though the CCR does make it easier).

In lieu of a Pex static analysis, there is the possibility of QuickCheck-like test suites derived from CodeContract annotations.

Unfortunately, there is a "common wisdom" in the .NET world that an "is" test followed by an "as" cast is performing two casts, and in fact one should simply perform the "as" cast then check the result against null:

// to this form:if (o is string){ string a = o as string; Console.WriteLine(a);}

In fact, that's not the case, as any compiler worth its salt will coalesce the two tests into a single cast and branch operation. I took the tests from the above dispatching and altered them to perform the cast-and-null check, then I ran the tests again with the original is-then-as form. The latter form was about 6% faster on every timing run.

There's obviously some optimization being done here, but the lesson is: don't try to outsmart the compiler. In general, just write code the safe way and let the compiler optimize it for you. It's safer to perform a test then cast within a delimited scope like an if-statement, than to let the possibly null variable float around in the outer scope where you might use it inadvertently later in the method or during refactoring.

If performance turns out to be an issue, profile before trying these sorts of low-level "optimizations", because you might be surprised at the results.

Thursday, January 29, 2009

Awhile ago, I had posted about an embedding of a stack language in C#. The type signatures of the functions and the stack object encoded the consumption and production of stack values, so if your program compiled, it ran correctly.

Unfortunately, the prior structure had a safety problem when generating code which I noted, but didn't have time to address.

The new structure provided below does not have the safety problem, and any functions that compile are guaranteed to execute correctly. I have also altered the style to emphasize the row variable representing the "rest of the record" which the operation knows nothing about. The row variable is denoted by "_".

This is still a fairly limited embedding, but I have added a few convenience functions, and may yet add more. Here is a sample program: