It's been a while since I posted.
I was wondering what other people thought about this addition to C++ by
Apple. Heh.

It doesn't solve any major problem and the syntax is ugly. If it was
standard C++, I'd use it anyway. Since it isn't, I'll stick with
Boost.Lambda (which is more limited and much uglier, but portable).
--
Rainer Deyke - rainerd eldwood.com

I find it strange that people are continuing to reinvent nested functions
in ugly ways.

No offense, but Ruby and Python users would probably get a good chuckle at
hearing the D creator saying that ;)
But in the case of C/C++, I consider it a minor miracle that they were able
to pull it off without making it 10x uglier than what they actually ended up
with.

Walter: may I ask, with this, the reddit posts, and the Dr. Dobb's code post,
why the interest in this particular topic right now? Didn't you implement
this a long time ago?

It was one of the first things implemented in D.
But I was thinking about it lately as I prepare the materials for the
Compiler Construction seminar in a few weeks. Everyone tells me they are
simple and obvious, yet in language after language they get added in
bizarre ways that suggest that *somebody*, me or them, is just not
getting it.
So I thought it was time for an article.
I mean, how can one miss the most stunningly obvious syntax:
int foo(int a)
{
    int bar(int i) { return a + 3; }
    return bar(3);
}
and that nested functions are useful for many more things than passing
around pointers to them. For example, as helper functions.

Which further makes it a mystery why this has to be badly reinvented.
(But note that the gcc extension has the same problem D1 has with them;
the closures don't survive the end of the function. This was fixed with D2.)
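
To make the lifetime issue concrete, here is a minimal sketch using the lambdas planned for C++0x rather than D or the gcc extension. The name makeAdder and the use of std::function are my own illustration, not anything from the article. Capturing a by value copies it into the closure, so the closure stays valid after makeAdder returns; capturing by reference ([&a]) would leave a dangling reference, which is roughly the D1/gcc situation.

```cpp
#include <functional>

// Hypothetical helper: returns a closure that outlives this call.
// [a] captures a by value, so the closure carries its own copy of a.
std::function<int(int)> makeAdder(int a)
{
    return [a](int i) { return a + i; };
}
```

Calling makeAdder(3) and invoking the result later with 4 is well defined, because nothing in the closure points back into the dead stack frame.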

True for nested functions but nested functions are not blocks.
typedef void (*foo_t)(void);
foo_t getSomeFunc()
{
    int x = 4;
    void bar(void) {
        printf("%d\n", x);
    }
    return bar;
}
foo_t func = getSomeFunc();
func(); // Undefined: nested functions cannot be returned, and in any
        // case no variable capturing happens, so the variable x no
        // longer exists when func is called after getSomeFunc returns.
On the contrary, the following would work very well with blocks:
typedef void (^foo_t)(void);
foo_t getSomeBlock()
{
    int x = 4;
    foo_t bar = ^{
        printf("%d\n", x);
    };
    return Block_copy(bar);
}
foo_t blk = getSomeBlock();
blk(); // Will happily print 4
Block_release(blk); // free up memory of block
Though, they could probably make some hacks to gcc to add the copying
and variable capturing to nested functions. Such hacks would most likely
break a lot of code as the ABIs would have to be changed.
This means that blocks are very suitable for enqueuing in something
using the thread pool pattern. This is what GCD does, and in my humble
opinion GCD is a very nice concurrency model for C. The main issue
now is that GCD is seriously in need of standardisation, through
POSIX and the C language.
I suppose that blocks will not be added to C officially for quite some
time, and therefore not to POSIX either. I think they should work out a
version of C that includes them in the standard, as was done for
embedded applications, i.e. a set of extensions to the core language.
/ Matt

S. wrote:
I find it strange that people are continuing to reinvent nested
functions in ugly ways.

The blocks are not nested functions, they are more like closures. There
are some block copy functions that move a block to the heap (including
the captured variables).
Nested functions can usually not be called after the defining function
has returned. You cannot return blocks directly (since they are located
on the stack), but you can return a heap copy of the block.

S. wrote:
I find it strange that people are continuing to reinvent nested
functions in ugly ways.

The blocks are not nested functions, they are more like closures.

Nested functions do closures in a straightforward way, so by leaving off
nested functions they were forced to make an ugly syntax <g>. This is
why I shake my head in just not understanding the process that led to
their design.

There
are some block copy functions that move a block to the heap (including
the captured variables).
Nested functions can usually not be called after the defining function
has returned. You cannot return blocks directly (since they are located
on the stack), but you can return a heap copy of the block.

Nested functions do closures in a straightforward way, so by leaving off
nested functions they were forced to make an ugly syntax <g>. This is
why I shake my head in just not understanding the process that led to
their design.

That is because nested functions and block closures have different purposes.
Block closures avoid delocalization of code. For example, in Smalltalk,
you could write something like the following to calculate the sum of the
first 100 integers:
s := 0.
1 to: 100 do: [ :i | s := s + i ].
Packaging the 's := s + i' block inside a nested function instead would
(aside from unnecessarily having to also pass 's' by reference or by
value-result to the function) require you to move back and forth between
the piece of code above and the nested function in order to comprehend
it. More generally, it allows you to write custom control structures,
such as map, fold, and filter to generate more compact and coherent code.
This is especially prevalent in functional languages (Scheme, LISP,
Haskell, the whole ML family) and Smalltalk or Ruby.
Research seems to indicate [1] that there is indeed a measurable
readability benefit to this style of programming.
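
For readers who don't know Smalltalk, roughly the same thing can be sketched with a C++0x lambda. This is only an illustration under my own assumptions: the function name sumFirst is made up, and std::for_each stands in for the 'to:do:' control structure. The point is that the loop body sits inline and updates s directly through a by-reference capture.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Sum of the first n positive integers (n >= 0), with the loop body
// passed inline as a closure instead of living in a separate function.
int sumFirst(int n)
{
    std::vector<int> values(n);
    std::iota(values.begin(), values.end(), 1);   // fills 1, 2, ..., n

    int s = 0;
    // [&s] captures s by reference, so the closure updates the local
    // variable directly, much like the Smalltalk block does.
    std::for_each(values.begin(), values.end(), [&s](int i) { s += i; });
    return s;
}
```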
The benefit of nested functions is something different. In essence, a
nested function allows you to share state with the parent function
(absent shared state, there is little reason to not make the nested
function a top-level function instead). However, nested functions are
hidden inside their parent, which means that they are not reusable; on
top of that, object-oriented languages already have means to share state
between multiple methods (which is sometimes imperfect, but so are
nested functions).
So, there are perfectly good reasons to have block closures, but not
nested functions. Obviously, there are trade-offs, which may not appeal
to everybody, but the design is still a rational one.
Reimer Behrends
[1] http://infoscience.epfl.ch/record/138586

The benefit of nested functions is something different. In essence, a
nested function allows you to share state with the parent function
(absent shared state, there is little reason to not make the nested
function a top-level function instead).

The reasons for nested functions are:
1. factoring out repeated code within the function into a nested function
2. locality - a nested function is adjacent to where it is used, rather
than somewhere else
3. isolation - by scope rules, it is easy to see the nested function and
all calls to it
These may be mundane, but they make the code more readable
than the clumsy workarounds necessary without them. They can also often
be inlined by the compiler, making them a no-cost abstraction.

However, nested functions are
hidden inside their parent, which means that they are not reusable;

That's actually a feature, see (3).

on
top of that, object-oriented languages already have means to share state
between multiple methods (which is sometimes imperfect, but so are
nested functions).

Nested classes (Java) and functors (C++) are pretty ugly next to nested
functions.
One could argue that gcc has them as an extension but nobody uses them.
My experience with adding extensions to DM C++ is that nobody uses them
because it is non-standard, not because they are a bad idea.

So, there are perfectly good reasons to have block closures, but not
nested functions. Obviously, there are trade-offs, which may not appeal
to everybody, but the design is still a rational one.
Reimer Behrends
[1] http://infoscience.epfl.ch/record/138586

The reasons for nested functions are:
1. factoring out repeated code within the function into a nested function
2. locality - a nested function is adjacent to where it is used, rather
than somewhere else
3. isolation - by scope rules, it is easy to see the nested function and
all calls to it

You don't actually need nested functions for that. Consider, for
example, Tarjan's algorithm [1] to find strongly connected components in
a directed graph. It's normally a prime example of where you would use
nested functions: It's got some scaffolding to set up temporary state,
then repeatedly calls a recursive helper algorithm that operates on the
temporary state. Without going into the details of the algorithm, you
can basically write it as follows (using Scala as an example, though of
course it would also work in D, OCaml, etc.):
class DirectedGraph[V] {
  ...
  def stronglyConnectedComponents: Set[Set[V]] = {
    def recursiveHelper(vertex: V) = { ... }
    ...
  }
  ...
}
However, you could also factor out the algorithm into a class of its own:
class StronglyConnectedComponents[V](val graph: DirectedGraph[V]) {
  def findAll: Set[Set[V]] = { ... }
  private def recursiveHelper(vertex: V) = { ... }
}
This satisfies all your three criteria above. The potentially repeated
code is factored out into its own function, the helper function is next
to where it is used, and you can see who calls it since it is private.
However, unlike a nested function, you have additional benefits: You can
reuse the helper function, for starters. And since the algorithm isn't
an integral part of the DirectedGraph class anymore, you can furthermore
improve the refactoring, since the algorithm doesn't need the full
graph, but only a set of vertices and a successor function:
class StronglyConnectedComponents[V](val vertices: Set[V],
                                     val successors: V => Set[V]) {
  def findAll: Set[Set[V]] = { ... }
  private def recursiveHelper(vertex: V) = { ... }
}
Now we can use the algorithm not only for directed graphs, but also for
other, similar data structures that do not need to be a subtype of
DirectedGraph.
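
To show that the elided bodies are not hiding anything exotic, here is one way the refactored class could be filled in, sketched in C++ rather than Scala. Everything beyond the names findAll, recursiveHelper, vertices and successors is my own assumption: the member names, the use of std::map for the bookkeeping, and the requirement that V supports < and ==. It is a plain rendering of Tarjan's algorithm, not tuned for performance.

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Needs only a vertex set and a successor function, not a graph class.
template <typename V>
class StronglyConnectedComponents {
public:
    StronglyConnectedComponents(std::set<V> vertices,
                                std::function<std::set<V>(const V&)> successors)
        : vertices_(std::move(vertices)), successors_(std::move(successors)) {}

    std::set<std::set<V>> findAll()
    {
        // Reset the temporary state so findAll can be called repeatedly.
        index_.clear(); lowlink_.clear(); onStack_.clear();
        stack_.clear(); result_.clear(); counter_ = 0;
        for (const V& v : vertices_)
            if (!index_.count(v)) recursiveHelper(v);
        return result_;
    }

private:
    void recursiveHelper(const V& v)
    {
        index_[v] = lowlink_[v] = counter_++;
        stack_.push_back(v);
        onStack_.insert(v);

        for (const V& w : successors_(v)) {
            if (!index_.count(w)) {            // w not visited yet: recurse
                recursiveHelper(w);
                lowlink_[v] = std::min(lowlink_[v], lowlink_[w]);
            } else if (onStack_.count(w)) {    // w is in the current component
                lowlink_[v] = std::min(lowlink_[v], index_[w]);
            }
        }

        if (lowlink_[v] == index_[v]) {        // v is the root of a component
            std::set<V> component;
            bool done = false;
            while (!done) {
                V w = stack_.back();
                stack_.pop_back();
                onStack_.erase(w);
                done = (w == v);
                component.insert(std::move(w));
            }
            result_.insert(std::move(component));
        }
    }

    std::set<V> vertices_;
    std::function<std::set<V>(const V&)> successors_;
    std::map<V, int> index_, lowlink_;         // shared temporary state
    std::set<V> onStack_;
    std::vector<V> stack_;
    std::set<std::set<V>> result_;
    int counter_ = 0;
};
```

The shared temporary state that a nested function would read from the enclosing scope lives in private members here, which is exactly the trade-off under discussion.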

However, nested functions are hidden inside their parent, which means
that they are not reusable;

That's actually a feature, see (3).

As I showed above, you can have both locality and reusability. These are
not mutually exclusive properties.

on top of that, object-oriented languages already have means to share
state between multiple methods (which is sometimes imperfect, but so
are nested functions).

Nested classes (Java) and functors (C++) are pretty ugly next to nested
functions.

I am not even talking about nested classes, just classes (see above).
Classes are the standard vehicle in most object-oriented languages for
having multiple functions operating on shared state.
Mind you, I'm not saying that this is the only approach to deal with
such issues (and it has imperfections of its own). Just that it is an
alternative, equally valid way of doing things.
Reimer Behrends
[1] http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm

However, you could also factor out the algorithm into a class of its own:

Yes, so instead of a nested function with everything right there next to
its use, I have to go elsewhere in the file looking for a class. While
this is doable, I don't think it's half as nice. I also think that
trying to force something that's naturally a function that simply
accesses the outer function's variables into a class with member
variables that have to be copied, etc., is less desirable.
I don't think you'll agree with me <g>, but take this example and
rewrite it using classes:
int foo(int a)
{
    int bar(int i) { return a + 3; }
    return bar(3) + bar(4);
}
and let's compare!
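
For reference, one possible class-based rendering of that example, sketched in C++ since that is where functors come up. The names Bar, fooWithClass and fooWithLambda are hypothetical, and the closure version uses the C++0x lambda syntax rather than D's nested functions. Treat it as one way to do the comparison, not the only one.

```cpp
// Class-based version: the captured variable becomes a member that
// has to be copied in through the constructor.
class Bar {
public:
    explicit Bar(int a) : a_(a) {}
    // The parameter is unused, exactly as in the original example.
    int operator()(int /* i */) const { return a_ + 3; }
private:
    int a_;
};

int fooWithClass(int a)
{
    Bar bar(a);
    return bar(3) + bar(4);
}

// Closure version: bar sits next to its use and reads a directly.
int fooWithLambda(int a)
{
    auto bar = [a](int /* i */) { return a + 3; };
    return bar(3) + bar(4);
}
```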

One could argue that gcc has them as an extension but nobody uses them.
My experience with adding extensions to DM C++ is that nobody uses them
because it is non-standard, not because they are a bad idea.

And that's why I won't be using Apple's new closures, regardless of whether
I'd actually find them useful.

Yeah, that is a problem. They did add them to GCC (not only Clang)
though, and as far as GCC extensions go, most compilers adopt them
one way or another. Hopefully, they will be imported from the Apple
branch of GCC into the mainline.
Many things in GCC are just incredibly useful, including the vector_size
attribute, and people are using them. It is not a matter of being
non-standard, but rather of whether the extensions are good enough that
enough vendors implement them.
I think that the blocks are crying out for standardisation, maybe not by
ISO, but by some informal means between compiler vendors.
/ Mattias

One could argue that gcc has them as an extension but nobody uses them.
My experience with adding extensions to DM C++ is that nobody uses them
because it is non-standard, not because they are a bad idea.

And that's why I won't be using Apple's new closures, regardless of whether
I'd actually find them useful.

Even for OSX-specific code?
I see no reason not to use them in a Cocoa app for instance, nor when
you want to schedule tasks using Grand Central Dispatch. You're already
bound to Apple-world anyway.
--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

One could argue that gcc has them as an extension but nobody uses them.
My experience with adding extensions to DM C++ is that nobody uses them
because it is non-standard, not because they are a bad idea.

And that's why I won't be using Apple's new closures, regardless of whether
I'd actually find them useful.

Yup. I always write code with portability in mind, because my target platforms
have a way of changing. If I were writing user-mode applications for OSX
this might be different, since I'd be using Cocoa or whatever anyway, but...

I see no reason not to use them in a Cocoa app for instance, nor when
you want to schedule tasks using Grand Central Dispatch. You're already
bound to Apple-world anyway.

Nested functions do closures in a straightforward way, so by leaving off
nested functions they were forced to make an ugly syntax <g>. This is
why I shake my head in just not understanding the process that led to
their design.

Well, one problem would be that nested functions are already a feature
in GCC, and they are not compatible: turning them into closures would
break existing code, or at least the ABI, and an ABI-breaking modification
of GCC's nested functions would have been controversial, to say the least.

Nested functions can usually not be called after the defining function
has returned. You cannot return blocks directly (since they are
located on the stack), but you can return a heap copy of the block.

This is handled in D automatically.

I have not looked at D's implementation of this, but it looks nice, I think. In C
this would mean stealing some extra keywords, adding _Delegate or
whatever and then a file stddelegate.h that #defines delegate as _Delegate.
Also, without forcing GC on the user, returning a delegate would be
tricky to say the least (if efficiency is needed).
So given the situation, they probably made a decent choice, a bit like
patching the x86 CPU as AMD did for AMD64.
/ Mattias

It is actually intended to be added to C, which is kinda weird and against C's
road map. C has decided that it is a minimal abstraction over hardware to allow
cross-platform code and not much else. C1X will probably have barely any new
features unless Apple coughs up a lot of money.
I don't want this in C, but it is better suited to C++, if they didn't already
have this planned: http://en.wikipedia.org/wiki/C%2B%2B0x#Lambda_functions_and_expressions

I don't find the syntax that ugly. And I'd say the feature is at its
best when used with Grand Central Dispatch. As page 13 of this same
review says, it enables developers to transform synchronous operations
such as this:
NSDictionary *stats = [myDoc analyze];
[myModel setDict:stats];
[myStatsView setNeedsDisplay:YES];
[stats release];
into asynchronous ones with minimal effort:
dispatch_async(dispatch_get_global_queue(0, 0), ^{
    NSDictionary *stats = [myDoc analyze];
    dispatch_async(dispatch_get_main_queue(), ^{
        [myModel setDict:stats];
        [myStatsView setNeedsDisplay:YES];
        [stats release];
    });
});
Without it, you'd have to create a bunch of different functions,
probably accompanied by a context struct, to make this work,
complicating the code and making it harder to follow. It's just
syntactic sugar when you think about it, but it's that syntactic sugar
that makes it practical to express operations in a way they can run
asynchronously.
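
To give an idea of the "functions plus context struct" alternative, here is a rough sketch in portable C++ rather than Objective-C. Nothing in it is GCD: the std::queue is a hypothetical stand-in for a dispatch queue and is drained on the same thread, and ScaleContext, scaleTask, CTask and both scaleWith... functions are names I made up for the illustration. It only shows how the captured state has to be packed by hand when closures are not available.

```cpp
#include <functional>
#include <queue>
#include <utility>

// --- Without closures: a free function plus a hand-written context ---
struct ScaleContext {
    int factor;
    int input;
    int* result;
};

void scaleTask(void* raw)
{
    ScaleContext* ctx = static_cast<ScaleContext*>(raw);
    *ctx->result = ctx->factor * ctx->input;
}

// C-style task: function pointer plus untyped context pointer.
struct CTask {
    void (*fn)(void*);
    void* context;
};

int scaleWithContext(int factor, int input)
{
    int result = 0;
    ScaleContext ctx = {factor, input, &result};
    std::queue<CTask> queue;                 // stand-in for a dispatch queue
    queue.push(CTask{scaleTask, &ctx});
    while (!queue.empty()) {                 // drained synchronously here
        CTask task = queue.front();
        queue.pop();
        task.fn(task.context);
    }
    return result;
}

// --- With closures: the compiler builds the context for you ---
int scaleWithClosure(int factor, int input)
{
    int result = 0;
    std::queue<std::function<void()>> queue;
    queue.push([&result, factor, input] { result = factor * input; });
    while (!queue.empty()) {
        std::function<void()> task = std::move(queue.front());
        queue.pop();
        task();
    }
    return result;
}
```

Both versions compute the same thing; the difference is only in how much plumbing sits between the call site and the work being scheduled.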
--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/