Wednesday, August 20, 2008

Tiburon: fun with generics and anonymous methods

Update: I've added more code to the end of this entry.

The next release, Delphi 2009 (codename Tiburón) has some new language features: generics and anonymous methods in particular. I've decided to write up some of my experiments, playing with the new tools to see what can be done with them.

First, though, a disclaimer: I am of course working with the development tip for both compiler and RTL, so some details mentioned here (RTL in particular) may change before final release. However, the basics of generics and anonymous methods are pretty solid and should not change, so any gaps should be fillable if needs be.

Here's a generic method reference:

type
TFunc<T> = reference to function: T;

This guy (a "method reference") can refer to any global function, method or anonymous method which takes no parameters and returns a value. It's a managed type, like strings and interfaces, because it can contain state that needs to be freed when the last reference to it goes out of scope. Here's a trivial example of a method reference in use with an anonymous method:

Notice how the value returned from MakeCounter has captured its parameters, and that these parameters have become internal state of the return value. This action is called variable capture, and an anonymous method that performs variable capture is a closure. TFunc<T> is useful enough that it is predefined in SysUtils; there are similar versions taking up to 4 arguments, and there are also TProc method references, also taking up to 4 arguments.

This example suggests to me an easy way of creating enumerators in Delphi. Rather than having to define a separate enumerator class, and implement the GetCurrent and MoveNext functions, as well as a property that reads from GetCurrent, we can use an anonymous method that simply returns each value in sequence. To signal termination, we'll use a special exception. With these tools, we can write a counter generator that has an upper bound too:

Unfortunately, this has several drawbacks. First, the for-in syntax can't be used, because that expects a collection of some kind that has a method called GetEnumerator, whereas we only have the enumerator itself. Secondly, the enumeration can only be performed once. You can see this in the output:

We can fix both of these problems, however, by adding another layer of indirection. Rather than working with TFunc<T>, we'll work with TFunc<TFunc<T>>; and we'll write a TSimpleEnumerable<T> that returns our enumerator.

Now, inside one of the new units, Generics.Collections, are a couple of new base classes, TEnumerator<T> and TEnumerable<T>. I'll use those base classes too:

So there we have it: an easy way to create enumerators and simple collection wrappers using a combination of generics and anonymous methods. There's more that can be built on this foundation, but I'll leave that until later.

Update:

Joylon in the comments prompts me to show the alternative - i.e. how to write the above without using anonymous methods. Here's a basic enumerator which does the same thing as the above - i.e. generate an integer sequence of a specified length and stride:

At 35 lines, it is quite a bit (3x) more verbose than the directly equivalent MakeCounter function (11 lines; this isn't one that supports resetting or recreating the enumerator, but the translation above doesn't either):

However, even then, I wouldn't necessarily recommend using the above technique to implement all enumerators in general - the fact that exceptions are used to signal termination will cause some performance loss and, more importantly, quite a bit of annoyance when running in a debugger.

Rather, I intend, over the next few posts in this blog, to use anonymous methods in all sorts of ways so that you, the reader, might see opportunities for their creative use in your own code.

@Joylon: Most of the code is in the article is reusable library material. One of the problems with trying to show how to reduce effort in implementing A with some new applied abstraction f(A'), where A' < A, is that you have to introduce f(x); and it is typically the case with such abstractions that f(x) > A. Hopefully you can understand the example enough to see which bit is f(x) and which bit is A'. Probably the biggest problem with the post is that I never presented A, so you never saw the verbosity of that.

However, your complaint that all this does is list numbers, I find hard to take seriously. The point of the post is how to create an iterator without having to write a separate class (or two) and two (or three, if you include Reset) separate methods. Some imagination is required!

I think you missed my point Barry - I wasn't suggesting that you only eliminate anonymous methods in the implementation but that you instead consider the simplest possible implementation that would produce the same output (i.e. the desired result).

That simplest-possible implementation may not be AS re-usable (then again, ime people are more likely to re-use something they understand) but when assessing the value of re-use you have to consider not only how re-usable something IS but how likely and how often it is going to BE re-used and how much a re-usable implementation is actually going to save/improve anything.

Things haven't changed much in the last year or so - there is still a dearth of compelling, recognisably applicable examples to showcase these additions to the language. imho.

You're using weasel words when you talk about "the same output" - of course the case at hand is trivial, a listing of numbers. The case is trivial so that the mechanism is the only thing that is complex - otherwise there would be unnecessary complexity in the example.

"the simplest possible implementation" - would be a machine code listing, as that would be devoid of abstraction.

Or perhaps you have your own, biased, concept of simplicity? Maybe you use the word "simple" to mean "familiar to Joylon"? It seems that way to me, because I guess that you probably haven't internalized closures as a language construct.

I imagine that not everybody will be willing to limit their toolbox to only those tools "familiar to Joylon", however.

I give my support to Joylyon. Perhaps mr Kelly has forgotten that not all people who use Delphi are capable of writing compilers of their own? I agree wholeheartedly with Joylyon that the examples are complex, and I also think that this example didn't make me want to use the new features. I'm not saying the new features are no good, but the example didn't demonstrate their benefits in a way that appealed to me.

Also, the last comment from Barry struck me as quite patronizing. Maybe some humility would be better?

@Svein - I was annoyed with Joylon for his trollish comment on the Vista rant post.

Re examples, I'm sorry if they're too complex for you. I'm a developer, not a teacher, or even a marketer. I'll post what I find interesting. If I posted for different reasons, I wouldn't be posting for much longer (i.e. I would stop).

@Robert - they do not implement IEnumerable<T> / IEnumerator<T> because those interfaces descend from IEnumerable / IEnumerator, and since Delphi doesn't have a rooted type system like C#, that causes a problem implementing the Current property. What should IEnumerator.Current return on an instance you get back from IEnumerable<Integer>?

@IEnumeable vs IEnumerable<T>Well, you would not *have* to provide the inheritance as in the .Net world.Since there is no legacy code that could be using IEnumerable, you could just opt for the generic one as the only one.Otherwise an IEnumerable with Current as a Pointer could be possible as well (only used in cases where you don't know the type and thus pointer wouldn't be too bad a type choice).IMO forcing class inheritence on collections is a very, very bad thing.As an experienced .Net developer you probably came to appreciate the ease in which you can make your methods/classes work with any kind of container. As long as the container implemented either IEnumerable, ICollection<T>, etc...Requiring this to be TEnumerable<T> means that *every* container had to be change to inherit from it, as opposed to just implement an interface to make it more usuable for the new Delphi.

Don't get this wrong, I think you showed what you're made of with this first compiler of yours. :-)Just wanted to get some ideas of my chest...

I'm not sure if you are deliberately mis-spelling my name, or why you would, but please note that you are.

My words are not "weasel" at all.

You seem to be trying to show the benefits of some new language features. I am suggesting that your attempt is falling short because your chosen examples are poor.

Applying these language features and techniques in the case of this example results in more complex and harder to follow code, not less and easier to follow.

As for the suggestion that the simplest possible implementation is machine code, those certainly are "weasel words".

I suspect that you know full well that by "simplest possible" I meant easiest to follow, and a machine code listing would fail that test just as much as your anon methods and generics approach.

My "trollish comment" on your Vista rant was intended as a tongue in cheek bit of light hearted ribbing, given that us "D7 Luddites" were lambasted by *some* CodeGear insiders for not "getting" the new IDE.

If a little light hearted ribbing is "trolling" then colour me green and call me "Shrek".

@Jolyon - Mistake in the name is unintentional. Some time ago, I had mentally pronounced your name as I misspelled it. Sorry.

My point re "simple" is that it means something different to what you appear to think it means. Machine code is indeed more simple than a high-level language, because the primitives do less and are easier to understand individually.

Similarly, what you describe as simpler than the technique I outlined has primitives which do less and are easier to understand individually, in so far as they are less abstract.

However, higher-level abstraction allows writing more "dense" code. Sometimes, the best abstraction to use for a particular situation is passing code as an argument.

The example I had here was the problem of creating an enumerable stream. Normally, that requires creating two separate classes, and writing at least 3, if not 4 different methods. I reduced it down to a single method call.

You may not see the utility of the example, because you don't see the point of creating enumerable streams (perhaps you have never worked with C# LINQ). However, that criticism is specific to you, and is not general.