I'd like to talk about the idea of having closures in D. For those
who aren't used to the term closure, a closure is something like a
pointer to function, and usually is created by a syntactic construct
that appears in source code in the form of an anonymous "code
literal".
Closures in Smalltalk
=====================
I'll start by talking about closures in Smalltalk, which calls them
"Blocks". (I'm assuming people can read Smalltalk syntax, which may
be a bad assumption, but that's what I'm assuming.)
A block in Smalltalk is code that appears between square brackets.
An example would be
symbolTable at: identifier ifAbsent: [ defaultSymbol ].
The '[ defaultSymbol ]' is a block that is executed only if the
'identifier' key is not present in 'symbolTable'.
This example isn't very motivating, because defaultSymbol is just
a value. Consider this however
symbolTable at: identifier ifAbsent: [ UndeclaredSymbol new ].
The '[ UndeclaredSymbol new ]' block only gets executed if the
identifier is not present. This could be useful because making a
new UndeclaredSymbol might involve lots of computation.
The examples above return values, but blocks can also have side
effects. For example,
symbolTable at: identifier ifAbsent: [
undeclaredIdentifiers add: identifier.
UndeclaredSymbol new
].
Notice that the last expression evaluated in a block is what is
returned for the value of the block. (Please, no flames about
bracket placement style - I know some people prefer the open
bracket to go on a line by itself.)
Blocks can also take arguments, for example -
symbolTable keysAndValuesDo: [ :identifier :entry |
output nextPut:
'(', identifier printString, '->', entry printString, ')'.
].
which causes a printable form of a symbol table to be added to
the 'output' object.
Blocks get much of their power from being able to access local
context. In an example above, the block contains references to
'identifier' and 'undeclaredIdentifiers', which might be instance
variables or perhaps local variables. This ability puts blocks on an
equal footing with the more usual if/then/else and looping control
structures. (In fact these "built-in" control structures are
themselves implemented using blocks in Smalltalk, but that point is
incidental - the main thing is that having access to local context
gives an important kind of expressive power.)
Incidentally, blocks can declare variables local to the block.
(I'm not going to show an example for this.)
Blocks also have a capability that might be surprising to people
used to C, namely the ability to return (a value) from the method
that contains the block. So for example
referencesTo: identifier
"Answer a collection of all references to identifier"
| entry | "this is a local variable"
entry := self at: identifier ifAbsent: [ ^ Set new ].
^ programSource referencesToEntry: entry
The ^ in Smalltalk is like 'return' in C. The block '[ ^ Set new ]'
will, if executed, immediately return a value from the 'referencesTo:'
method, and the following statement will not be executed at all.
Here again, the ability to return a value from the enclosing function
is one that adds expressive power, and puts blocks on an equal footing
with built-in control structures.
Blocks in Smalltalk are fairly representative of closures that are
available in other programming languages, taking into account that
Smalltalk is more of an imperative language than a functional language
(in the sense of "functional programming").
Closures in D
=============
For the sake of discussion let's assume that closures will be included
in D. What are the implications of this? Clearly there are syntax
questions (about which more later), and there are questions about what
capabilities should be supported. For the moment let's use
[ ParameterList ]{ FunctionBody }
as the syntax for a closure in D, without being too particular about
types or return values, and look at semantic issues.
First, the body of a closure is the same as that for a function body,
which means closures can declare variables local to the closure
activation.
Second, the code in the body of the closure should have access to the
context of the surrounding code, notably instance variables and local
variables. Instance variables aren't a problem (just have the closure
keep a pointer to the object in question), but local variables are a
little trickier, since a closure can outlive the stack frame of the
method that created it. This could be handled by taking any variables
used in closure bodies and allocating them in a heap object and having
the closure hang on to a pointer to that. Yes, there is a cost
associated with doing that allocation, but only for the cases where
there are closures that reference local variables.
[Note added during editing - People who are used to stack-based local
variable frames may be surprised by the idea that local variables
might outlive the function activation. Rather than get into that
in too much detail here I think a follow-up post is a better idea.
Expect that shortly.]
Third, recursion - as with ordinary functions, closures need a
stack frame for their local variables so recursion can occur.
A closure activation is not the same as a closure.
Fourth, nested closures. Closure bodies can contain other closure
bodies. This brings with it all the usual machinery needed for
nested functions, in particular accessing local variables of
containing closures (when that happens). As with functions,
closures that have local variables accessed by contained closures
would allocate those local variables in an object in the heap.
Fifth, interchangeability with function pointers. Getting a closure
to masquerade as a function pointer is a little tricky (because a
function pointer is only one pointer, whereas a closure is two
pointers - a pointer to code and a pointer to context that the code
runs in). Maybe it's useful to do that, I don't know. But going the
other way is easy - turning a function pointer into a closure just
means adding a null pointer for execution context, and perhaps a dummy
closure body that simply invokes through the function pointer. The
more closures and function pointers are interchangeable (as far as the
function getting them as arguments is concerned), the more the
language follows the "Law of Least Astonishment".
Incidentally, closures being interchangeable with function pointers
means closures should have all the same parameter passing modes.
In particular, if functions have OUT and INOUT parameters, then
closures should have them.
Sixth, invoking a closure. How about the same syntax as that
for invoking through a function pointer?
(*closure_parameter)( argument, argument, argument, ... )
Seventh, closure returns. It's important to distinguish between
returning from a closure and returning from a function body containing
a closure. Blocks in Smalltalk use the usual return syntax (^) to
indicate return from the containing method, and use the last
expression evaluated as the return value for the block. Should there
be a way to return a value from a closure out of the middle of the
closure body? Should 'return' mean return from the closure or return
from the function? I can see arguments on both sides.
For nested closures, you can imagine an "intermediate" return that
returns from a containing closure but not the outermost function body.
Blocks in Smalltalk don't include this capability (innermost return or
outermost return are the only choices); I have no experience with it,
so I don't know how useful it is. But if D is to include multi-level
breaks, it probably should also include multi-level returns.
Eighth, other control structures. Consider the following code
fragment -
while( p < p_limit ){
candidate = *p;
candidate.enumerate_parts(
[ part ]{
if( part.passesTest() ) break;
}
);
p ++;
}
Notice what's happening. The closure is testing a value, and in the
event that the test succeeds, the 'break' breaks out of the while in
the function body outside the closure - a non-local transfer of
control. There are similar examples for 'continue' and (dare I say
it) 'goto'. These control structures aren't that different from
uplevel returns in terms of what mechanisms are needed.
Notice that any non-local control transfer - either a return out of or
a break/continue/goto into - a function body containing a closure can
fail at runtime if that function activation has already returned.
This is true for intermediate (closure) function bodies as well as
outermost function bodies. Presumably such a case would generate
some kind of exception.
Despite the difficulties of these exceptional conditions (and the
reactions I'm sure some people will have at the idea of using 'goto'
out of a closure), I would argue that if the language is going to be
orthogonal then as much as possible closures should look like the
regular built-in control structures and non-local control transfers
should be supported.
Syntax for Closures
===================
For invoking a closure - how about using the pointer to function
notation
(*closure)( argument, argument, argument, ... )
Note: personally I see nothing wrong with using the unadorned
function call notation for pointers-to-function
function_pointer( argument, argument, argument, ... )
and would use that for closures if it were allowed for function
pointers, but I don't want to start an argument on that topic.
For declaring a closure variable - this should be similar (not
identical) to declaring a function pointer. How will function
pointers be declared in D?
For writing a closure body - well I like the syntax I've been
using, assuming it doesn't cause too many problems:
[ ParameterList ] { FunctionBody }
The types of the parameters and the return value need to go somewhere,
which is sort of a pain. The parameter types might go in the
parameter list, much like function definitions, e.g.,
[ int a, int b ] { a < b ? a : b }
but for the return type, I'm at more of a loss. Perhaps other people
will have some suggestions.
Final Remarks
=============
I think closures should be considered for D. I've tried to lay out
the basic issues and implications of D having closures. I can say
based on experience that closures are extremely useful - given a
choice between closures and exception handling, I think closures
are more generally useful. Also, I think having closures in some
form is a prerequisite to writing a good set of Collection classes,
which is a large part of the motivation for having some form of
generics (or templates).

Quite a lot of text, and honestly it only confused me.
What's now the advantage of a closure? How is it implemented? Especially in
featureless systems where C originally was developed for. In UserSpace
resources are easy to get/implement, but what in kernel space? In embedded
OS-less applications? This places are the real motherhood of C.
Maybe more for a presentational hint, try to get faster to the point, and
explain it good and in a clean easy to understand way. This text sounds
just so universitary from professors who's thing they most love in life is
to hear themself's talking.
- Axel

Here's an example of where a closure might be useful:
quicksort(
string_pointer_array,
number_of_elements,
[ char * a, char * b ]{ strcmp( a, b ) }
);
So we get to write the compare function right here rather than
having to define a separate function.
Closures are also useful for dealing with collections:
some_collection.do(
[ ElementType element ]{
if( element.satisfies_some_condition() ) return element;
}
);
return NULL;
This search pattern occurs frequently, so we might very well have
a function that does it already, in which case we could write
ElementType e =
some_collection.first_element_that(
[ ElementType e ]{ e.satisfies_some_condition() }
);
Closures can also be used to create control structures that haven't
(yet) gotten into the language:
foreach_element_pair( array_a, array_b,
[ ElementType a_element, ElementType b_element ]{
/* do something with a_element and b_element */
}
);
Do these examples help?
Implementation - closures don't need interpretation or any sort
of runtime compilation; they are compiled in the same way that
functions are. The non-local control transfers need some sort
of mechanism similar to an exception handling mechanism - I
expect Walter can provide some details here.
To respond to the individual points -

Quite a lot of text, and honestly it only confused me.

Sorry for the confusion. There was a lot of ground to cover, and
I thought breaking it into small pieces would help the presentation.

What's now the advantage of a closure? How is it implemented? Especially in
featureless systems where C originally was developed for. In UserSpace
resources are easy to get/implement, but what in kernel space? In embedded
OS-less applications? This places are the real motherhood of C.

Compiling a closure is much like compiling a function. The cost of
calling a closure is about the same as calling through a pointer to
function; the major difference is that a closure needs an extra
pointer to establish its execution context. Functional closures like
the quicksort example above have essentially no more overhead than
calling through pointer to function. Closures that need access to
local context require one heap allocation on function entry (regardless
of how many closures in the function there are).
I tried to talk about some advantages up above.

Maybe more for a presentational hint, try to get faster to the point, and
explain it good and in a clean easy to understand way. This text sounds
just so universitary from professors who's thing they most love in life is
to hear themself's talking.

string_pointer_array,
number_of_elements,
[ char * a, char * b ]{ strcmp( a, b ) }
);
So we get to write the compare function right here rather than
having to define a separate function.
Closures are also useful for dealing with collections:
some_collection.do(
[ ElementType element ]{
if( element.satisfies_some_condition() ) return element;
}
);
return NULL;
This search pattern occurs frequently, so we might very well have
a function that does it already, in which case we could write
ElementType e =
some_collection.first_element_that(
[ ElementType e ]{ e.satisfies_some_condition() }
);
Closures can also be used to create control structures that haven't
(yet) gotten into the language:
foreach_element_pair( array_a, array_b,
[ ElementType a_element, ElementType b_element ]{
/* do something with a_element and b_element */
}
);
Do these examples help?

If I understand correctly, the only thing similar to Closures that I've seen
are Java inner (and anonymous) classes.
Does inner classes support resolve the problem?
Ciao

If I understand correctly, the only thing similar to Closures that I've seen
are Java inner (and anonymous) classes.
Does inner classes support resolve the problem?

Nested classes (assuming that's what Java inner classes are) seem
like a related but distinct mechanism. An important aspect of
closures is being able to write them right where you need them
rather than defining them elsewhere. The experience I have with
nested classes (which experience is in C++) is that nested
classes don't provide this capability, which is a significant
lack.
Beyond that, it's possible that some sort of nested class facility
could provide a semantically equivalent mechanism. I'm not
aware of any however.

If I understand correctly, the only thing similar to Closures that I've

are Java inner (and anonymous) classes.
Does inner classes support resolve the problem?

Nested classes (assuming that's what Java inner classes are) seem
like a related but distinct mechanism. An important aspect of
closures is being able to write them right where you need them
rather than defining them elsewhere. The experience I have with
nested classes (which experience is in C++) is that nested
classes don't provide this capability, which is a significant
lack.

Agreed.
But inner classes are not simply nested classes. A nested class is what in
Java
is called "static inner class", while non-static inner classes behave much
like closures.
Ciao

The message doesn't say what either "static inner classes" or
"non-static inner classes" are.
After doing a bit of research on the web, I found these pages:
http://java.sun.com/products/jdk/1.1/docs/guide/innerclasses/spec/innerclasses.doc1.htmlhttp://java.sun.com/products/jdk/1.1/docs/guide/innerclasses/spec/innerclasses.doc2.html
The phrase "nested classes" as I used the term was intended to include
both what Java calls static inner classes and (non-static) inner
classes. Java allows anonymous inner classes which can be written
in expressions.
Java inner classes provide some capabilities that closures do not:
1) Multiple instances can be made;
2) They have encapsulated state (the inner class's instance
variables);
3) They can have more than one "entry point" (method).
Closures are less general in some ways but also offer some advantages:
1) Closures offer a more compact notation for what they do;
2) Closures allow non-local control transfers (which as near as
I can tell inner classes in Java do not).
So I don't see either closures or inner classes as providing all
the capabilities of the other. Like the earlier message said,
they are related but distinct mechanisms.
Perhaps inner classes are useful enough to merit inclusion. That's
a big topic for discussion.
The advantages listed for closures argue for closures being considered
for inclusion in D whether inner classes are included or not.
That all make sense?
"Roberto Mariottini" <rmariottini lycosmail.com> writes:

If I understand correctly, the only thing similar to Closures that I've

are Java inner (and anonymous) classes.
Does inner classes support resolve the problem?

Nested classes (assuming that's what Java inner classes are) seem
like a related but distinct mechanism. An important aspect of
closures is being able to write them right where you need them
rather than defining them elsewhere. The experience I have with
nested classes (which experience is in C++) is that nested
classes don't provide this capability, which is a significant
lack.

Agreed.
But inner classes are not simply nested classes. A nested class is what in
Java
is called "static inner class", while non-static inner classes behave much
like closures.
Ciao

If I understand correctly, the only thing similar to Closures that I've
seen are Java inner (and anonymous) classes.
Does inner classes support resolve the problem?

What an insight! Yes thinking about it inner classes are the same thing in
java, only with another skin. (They running frame is created upon
execution, it can outlive it's father routine, you can have several of them
at the same time etc... )
So I used already something like closures without knowing it :o)
- Axel
--
|D) http://www.dtone.org

form is a prerequisite to writing a good set of Collection classes,
which is a large part of the motivation for having some form of
generics (or templates).

First, I don't understand the need of closures for Collections
(not that I understand them - closures - very well at all =)).
But since D has dynamic and associative arrays, I think most
requirements for containers are covered.

Dynamic and associative arrays barely scratch the surface of container
types.
Consider a video game, which maintains its world database as an octree of
objects of various types. Would it not be nice to be able to write code
something like this:?
octreeScene.ForEach( [ octreeNode node ] { if ( node.IsAlive() &&
camera.CanSee(node.boundSphere) ) node.Render(); } );
This is alot of expressive power... something that may help D out alot. I
really needs something to make up for lack of generics.
The alternative is a fairly clunky functor object. This is almost a way to
get the compiler to write functor objects for you... I have a feeling they'd
be pretty useful for callbacks as well.
Sean
"Pavel "EvilOne" Minayev" <evilone omen.ru> wrote in message
news:9sb9jv$ik3$1 digitaldaemon.com...

form is a prerequisite to writing a good set of Collection classes,
which is a large part of the motivation for having some form of
generics (or templates).

First, I don't understand the need of closures for Collections
(not that I understand them - closures - very well at all =)).
But since D has dynamic and associative arrays, I think most
requirements for containers are covered.

Dynamic and associative arrays barely scratch the surface of container
types.
Consider a video game, which maintains its world database as an octree of
objects of various types. Would it not be nice to be able to write code
something like this:?
octreeScene.ForEach( [ octreeNode node ] { if ( node.IsAlive() &&
camera.CanSee(node.boundSphere) ) node.Render(); } );
This is alot of expressive power... something that may help D out alot. I
really needs something to make up for lack of generics.

for-each loops is something that I was going to offer and that, IMHO,
must be in the language that supports dynamic and associative arrays.
So, this example could be rewritten as (note, this is also a syntax
proposal):
for(node: octreeNode)
{
if (node.IsAlive() && camera.CanSee(node.boundSphere))
node.Render();
}

In fact, after reading this one more time, I really can't
understand HOW can this be implemented in a non-interpreted
language like D. Nor do I see any advantages of it...
Maybe I'm wrong. I'm not sure that I understood anything
right, besides, I'm not familiar with Smalltalk, so could
you please give a shorter and more self-explanatory example? =)

In fact, after reading this one more time, I really can't
understand HOW can this be implemented in a non-interpreted
language like D. Nor do I see any advantages of it...

Closures can be used as a form of generic programming, but in
a slightly different way than C++ templates. Perl has them and
makes them very convenient to use; Perl has the advantage that
the run-time environment is guaranteed to include a Perl
compiler[1].
Here's a trivial example of a closure in Perl:
for my $color (qw[red yellow orange green blue purple violet])
{
*$color = sub { qq<<FONT COLOR="\U$color\E"> _</FONT>> };
}
Don't worry about the hairballs, that's just Perl. :)
This automatically generates functions/subroutines red(), yellow(),
orange()... etc., which generate some HTML to render strings in the
specified color.
In C/C++, you could approach this with #defined macros (except for
the trick where function red() generates the color tag "RED"), but
closures offer quite a bit more power than this.
The above example in and of itself could be evaluated completely at
compile time, but you could hypothetically replace the list of color
names above with a list of arbitrary dynamic text strings. At this
point, you _have_ to have dynamic binding and a compiler in the
runtime. My guess is that this need is incompatible with the desire
to have D be easy to implement.
-RB
[1] Inasmuch as perl is a compiled language.

The above example in and of itself could be evaluated completely at
compile time, but you could hypothetically replace the list of color
names above with a list of arbitrary dynamic text strings. At this
point, you _have_ to have dynamic binding and a compiler in the
runtime. My guess is that this need is incompatible with the desire
to have D be easy to implement.

Thanks for explaining, now everything seems much clearer.
And I don't think that it's worth implementing. As you've
said it requires dynamic binding, which would probably bloat
the code... no thanks. Just not the "D way", I suppose -
or at least it feels like that.

The Perl example is generating functions dynamically at runtime.
Closures are more static, similar to nested functions. In fact
the main difference between a nested function and a closure is
that closures are written in line at the point of use.
Imagine a C-like language with nested functions, eg
int
outer_function(){
int
inner_function(){
/* inner function body ... */
}
needs_pointer_to_function( inner_function );
}
A closure is like a function except the code is written directly where
it's needed:
int
outer_function(){
needs_pointer_to_function( []{ /* inner function body ... */ } );
}
The Perl example (as I understand it - I'm not a Perl expert) is
generating function names (and I guess function bodies) dynamically
and hooking those functions into the environment.
Closures don't do any dynamic generation. The code is compiled
just like a function body, and the value of the closure is stored
in an ordinary variable, like pointer-to-function, except closures
have two pointers - one for the code body, one for the execution
environment - rather than just one.
Russell Borogove <kaleja estarcion.com> writes:

In fact, after reading this one more time, I really can't
understand HOW can this be implemented in a non-interpreted
language like D. Nor do I see any advantages of it...

Closures can be used as a form of generic programming, but in
a slightly different way than C++ templates. Perl has them and
makes them very convenient to use; Perl has the advantage that
the run-time environment is guaranteed to include a Perl
compiler[1].
Here's a trivial example of a closure in Perl:
for my $color (qw[red yellow orange green blue purple violet])
{
*$color = sub { qq<<FONT COLOR="\U$color\E"> _</FONT>> };
}
Don't worry about the hairballs, that's just Perl. :)
This automatically generates functions/subroutines red(), yellow(),
orange()... etc., which generate some HTML to render strings in the
specified color.
In C/C++, you could approach this with #defined macros (except for
the trick where function red() generates the color tag "RED"), but
closures offer quite a bit more power than this.
The above example in and of itself could be evaluated completely at
compile time, but you could hypothetically replace the list of color
names above with a list of arbitrary dynamic text strings. At this
point, you _have_ to have dynamic binding and a compiler in the
runtime. My guess is that this need is incompatible with the desire
to have D be easy to implement.
-RB
[1] Inasmuch as perl is a compiled language.

The following is a paper about adding closures to C++. Unfortunately
there method does not seem to support closures that persist beyond
the lifetime of the activation record of the enclosing function.
http://people.debian.org/~karlheg/Usenix88-lexic.pdf
<la7y6nvo shamko.com> wrote in message
news:s7cbsieubtt.fsf michael.shamko.com...

I'd like to talk about the idea of having closures in D. For those
who aren't used to the term closure, a closure is something like a
pointer to function, and usually is created by a syntactic construct
that appears in source code in the form of an anonymous "code
literal".
Closures in Smalltalk
=====================
I'll start by talking about closures in Smalltalk, which calls them
"Blocks". (I'm assuming people can read Smalltalk syntax, which may
be a bad assumption, but that's what I'm assuming.)
A block in Smalltalk is code that appears between square brackets.
An example would be
symbolTable at: identifier ifAbsent: [ defaultSymbol ].
The '[ defaultSymbol ]' is a block that is executed only if the
'identifier' key is not present in 'symbolTable'.
This example isn't very motivating, because defaultSymbol is just
a value. Consider this however
symbolTable at: identifier ifAbsent: [ UndeclaredSymbol new ].
The '[ UndeclaredSymbol new ]' block only gets executed if the
identifier is not present. This could be useful because making a
new UndeclaredSymbol might involve lots of computation.
The examples above return values, but blocks can also have side
effects. For example,
symbolTable at: identifier ifAbsent: [
undeclaredIdentifiers add: identifier.
UndeclaredSymbol new
].
Notice that the last expression evaluated in a block is what is
returned for the value of the block. (Please, no flames about
bracket placement style - I know some people prefer the open
bracket to go on a line by itself.)
Blocks can also take arguments, for example -
symbolTable keysAndValuesDo: [ :identifier :entry |
output nextPut:
'(', identifier printString, '->', entry printString, ')'.
].
which causes a printable form of a symbol table to be added to
the 'output' object.
Blocks get much of their power from being able to access local
context. In an example above, the block contains references to
'identifier' and 'undeclaredIdentifiers', which might be instance
variables or perhaps local variables. This ability puts blocks on an
equal footing with the more usual if/then/else and looping control
structures. (In fact these "built-in" control structures are
themselves implemented using blocks in Smalltalk, but that point is
incidental - the main thing is that having access to local context
gives an important kind of expressive power.)
Incidentally, blocks can declare variables local to the block.
(I'm not going to show an example for this.)
Blocks also have a capability that might be surprising to people
used to C, namely the ability to return (a value) from the method
that contains the block. So for example
referencesTo: identifier
"Answer a collection of all references to identifier"
| entry | "this is a local variable"
entry := self at: identifier ifAbsent: [ ^ Set new ].
^ programSource referencesToEntry: entry
The ^ in Smalltalk is like 'return' in C. The block '[ ^ Set new ]'
will, if executed, immediately return a value from the 'referencesTo:'
method, and the following statement will not be executed at all.
Here again, the ability to return a value from the enclosing function
is one that adds expressive power, and puts blocks on an equal footing
with built-in control structures.
Blocks in Smalltalk are fairly representative of closures that are
available in other programming languages, taking into account that
Smalltalk is more of an imperative language than a functional language
(in the sense of "functional programming").
Closures in D
=============
For the sake of discussion let's assume that closures will be included
in D. What are the implications of this? Clearly there are syntax
questions (about which more later), and there are questions about what
capabilities should be supported. For the moment let's use
[ ParameterList ]{ FunctionBody }
as the syntax for a closure in D, without being too particular about
types or return values, and look at semantic issues.
First, the body of a closure is the same as that for a function body,
which means closures can declare variables local to the closure
activation.
Second, the code in the body of the closure should have access to the
context of the surrounding code, notably instance variables and local
variables. Instance variables aren't a problem (just have the closure
keep a pointer to the object in question), but local variables are a
little trickier, since a closure can outlive the stack frame of the
method that created it. This could be handled by taking any variables
used in closure bodies and allocating them in a heap object and having
the closure hang on to a pointer to that. Yes, there is a cost
associated with doing that allocation, but only for the cases where
there are closures that reference local variables.
[Note added during editing - People who are used to stack-based local
variable frames may be surprised by the idea that local variables
might outlive the function activation. Rather than get into that
in too much detail here I think a follow-up post is a better idea.
Expect that shortly.]
Third, recursion - as with ordinary functions, closures need a
stack frame for their local variables so recursion can occur.
A closure activation is not the same as a closure.
Fourth, nested closures. Closure bodies can contain other closure
bodies. This brings with it all the usual machinery needed for
nested functions, in particular accessing local variables of
containing closures (when that happens). As with functions,
closures that have local variables accessed by contained closures
would allocate those local variables in an object in the heap.
Fifth, interchangeability with function pointers. Getting a closure
to masquerade as a function pointer is a little tricky (because a
function pointer is only one pointer, whereas a closure is two
pointers - a pointer to code and a pointer to context that the code
runs in). Maybe it's useful to do that, I don't know. But going the
other way is easy - turning a function pointer into a closure just
means adding a null pointer for execution context, and perhaps a dummy
closure body that simply invokes through the function pointer. The
more closures and function pointers are interchangeable (as far as the
function getting them as arguments is concerned), the more the
language follows the "Law of Least Astonishment".
Incidentally, closures being interchangeable with function pointers
means closures should have all the same parameter passing modes.
In particular, if functions have OUT and INOUT parameters, then
closures should have them.
Sixth, invoking a closure. How about the same syntax as that
for invoking through a function pointer?
(*closure_parameter)( argument, argument, argument, ... )
Seventh, closure returns. It's important to distinguish between
returning from a closure and returning from a function body containing
a closure. Blocks in Smalltalk use the usual return syntax (^) to
indicate return from the containing method, and use the last
expression evaluated as the return value for the block. Should there
be a way to return a value from a closure out of the middle of the
closure body? Should 'return' mean return from the closure or return
from the function? I can see arguments on both sides.
For nested closures, you can imagine an "intermediate" return that
returns from a containing closure but not the outermost function body.
Blocks in Smalltalk don't include this capability (innermost return or
outermost return are the only choices); I have no experience with it,
so I don't know how useful it is. But if D is to include multi-level
breaks, it probably should also include multi-level returns.
Eighth, other control structures. Consider the following code
fragment -
while( p < p_limit ){
candidate = *p;
candidate.enumerate_parts(
[ part ]{
if( part.passesTest() ) break;
}
);
p ++;
}
Notice what's happening. The closure is testing a value, and in the
event that the test succeeds, the 'break' breaks out of the while in
the function body outside the closure - a non-local transfer of
control. There are similar examples for 'continue' and (dare I say
it) 'goto'. These control structures aren't that different from
uplevel returns in terms of what mechanisms are needed.
Notice that any non-local control transfer - either a return out of or
a break/continue/goto into - a function body containing a closure can
fail at runtime if that function activation has already returned.
This is true for intermediate (closure) function bodies as well as
outermost function bodies. Presumably such a case would generate
some kind of exception.
Despite the difficulties of these exceptional conditions (and the
reactions I'm sure some people will have at the idea of using 'goto'
out of a closure), I would argue that if the language is going to be
orthogonal then as much as possible closures should look like the
regular built-in control structures and non-local control transfers
should be supported.
Syntax for Closures
===================
For invoking a closure - how about using the pointer to function
notation
(*closure)( argument, argument, argument, ... )
Note: personally I see nothing wrong with using the unadorned
function call notation for pointers-to-function
function_pointer( argument, argument, argument, ... )
and would use that for closures if it were allowed for function
pointers, but I don't want to start an argument on that topic.
For declaring a closure variable - this should be similar (not
identical) to declaring a function pointer. How will function
pointers be declared in D?
For writing a closure body - well I like the syntax I've been
using, assuming it doesn't cause too many problems:
[ ParameterList ] { FunctionBody }
The types of the parameters and the return value need to go somewhere,
which is sort of a pain. The parameter types might go in the
parameter list, much like function definitions, e.g.,
[ int a, int b ] { a < b ? a : b }
but for the return type, I'm at more of a loss. Perhaps other people
will have some suggestions.
Final Remarks
=============
I think closures should be considered for D. I've tried to lay out
the basic issues and implications of D having closures. I can say
based on experience that closures are extremely useful - given a
choice between closures and exception handling, I think closures
are more generally useful. Also, I think having closures in some
form is a prerequisite to writing a good set of Collection classes,
which is a large part of the motivation for having some form of
generics (or templates).