Wednesday, September 13, 2006

This post describes a draft proposal for adding support for closures to the Java programming language for the Dolphin (JDK 7) release. It was carefully designed to interoperate with the current idiom of one-method interfaces. The latest version of the proposal and a prototype can be found at http://www.javac.info/.

This revision of the closures proposal eliminates function types, introduces the Unreachable type, and gives related reachability rules. A closure's "return" type is now called its result type. The abbreviated invocation syntax now takes any statement, not just a block, to make nesting work nicely.

Gilad Bracha, Neal Gafter, James Gosling, Peter von der Ahé

Modern programming languages provide a mixture of primitives for composing programs. C#, JavaScript, Ruby, Scala, and Smalltalk (to name just a few) have direct language support for delayed-execution blocks of code, called closures. A proposal for closures is working its way through the C++ standards committees as well. Closures provide a natural way to express some kinds of abstraction that are currently quite awkward to express in Java. For programming in the small, closures allow one to abstract an algorithm over a piece of code; that is, they allow one to more easily extract the common parts of two almost-identical pieces of code. For programming in the large, closures support APIs that express an algorithm abstracted over some computational aspect of the algorithm. This proposal outlines a specification of closures for Java without introducing function types.

1. Closure Literals

We introduce a syntactic form for constructing a closure value:

Expression3:
    Closure

Closure:
    FormalParameters { BlockStatements_opt Expression_opt }

Example

2. The type of a closure

A closure literal has a "closure type" that exists transiently at compile time. It is converted to some object type at compile time; see Closure Conversion for details. It is a compile-time error if a closure is not subject to a closure conversion.

The type of a closure is inferred from its form as follows:

The argument types of a closure are the types of the declared arguments.

If the body of the closure ends with an expression, the result type of the closure is the type of that expression; otherwise if the closure can complete normally, the result type is void; otherwise the result type is Unreachable.

The set of thrown types of a closure is the set of checked exception types thrown by the body.

Example

The following illustrates a closure being assigned to a variable of a compatible object type.
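As a rough analogue only, the sketch below uses present-day Java lambda syntax rather than the closure-literal syntax proposed here. It shows a block of code whose argument type, result type, and thrown types are inferred from its form and then matched against a one-method interface. The names `ClosureTypeSketch`, `IntTransformer`, and `plusTwo` are invented for illustration.

```java
// Hypothetical analogue using modern Java lambdas; not the proposal's syntax.
class ClosureTypeSketch {
    // A one-method interface; the name IntTransformer is invented for this sketch.
    interface IntTransformer {
        int apply(int x);
    }

    static int plusTwo(int value) {
        // Argument type: int (declared). Result type: int, the type of the
        // trailing expression. Thrown types: none.
        IntTransformer plus2 = (int x) -> x + 2;
        return plus2.apply(value);
    }

    public static void main(String[] args) {
        System.out.println(plusTwo(40)); // prints 42
    }
}
```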

3. synchronized parameters

We allow the qualifier synchronized to be used on formal method arguments. When a closure is passed to such a parameter, the closure conversion (see Closure Conversion) is allowed to close over its context: the closure may use non-final local variables from enclosing scopes, and may use control-transfer statements such as break, continue, and return that transfer control outside of the body of the closure. An overriding method declaration may only add (i.e. may not remove) the synchronized qualifier to formal parameters compared to a method it overrides; an abstract method that overrides one whose parameter is declared synchronized must itself declare the parameter synchronized. If a closure is passed to a parameter that is not declared synchronized, then the only local variables from enclosing scopes it may access are those that are marked final.

Note: this qualifier is necessary to enable API writers to distinguish synchronous from asynchronous receivers of closures. For compatibility with existing APIs, most of which are asynchronous, the default is not allowing closing over the context. The constraint on overriders preserves the substitutability of subtypes: if a closure is allowed when passed to the superclass method, it must be allowed when passed to the subclass method. A consequence of the overriding constraint is that no existing overridable method can be retrofitted to accept synchronous closures without breaking source compatibility. Static methods such as AccessController.doPrivileged can be retrofitted. The synchronized keyword isn't an ideal choice, but it is the best we've found so far.

Names that are in scope where a closure is defined may be referenced within the closure. It is a compile-time error to access a local variable declared outside the closure from within a closure unless the closure is passed to a parameter that is declared synchronized or the variable was declared final.

Note: a local variable that is referenced within a closure is in scope in the body of the closure. Consequently the lifetime of objects referenced by such variables may extend beyond the time that the block containing the local declaration completes.

4. Closure conversion

A closure literal may be assigned to a variable or parameter of any compatible object type by the closure conversion:

There is a closure conversion from every closure of type T to every interface type that has a single method m with signature U such that T is compatible with U. A closure of type T is compatible with a method U iff all of the following hold:

Either

the result type of T is the same as the return type of U; or

there is an assignment conversion from the result type of T to the return type of U; or

the result type of T is Unreachable.

T and U have the same number of arguments.

For each corresponding argument position in the argument list of T and U, both argument types are the same.

Every exception type in the throws of T is a subtype of some exception type in the throws of U.

If the closure is not passed to a parameter that was declared synchronized then

it is a compile-time error if the closure being converted contains a break, continue or return statement whose execution would result in a transfer of control outside the closure

it is a compile-time error if the closure being converted refers to a non-final local variable or parameter declared in an enclosing scope.

The closure conversion to a non-synchronized parameter applies only to closures that obey the same restrictions that apply to local and anonymous classes. The motivation for this is to help catch inadvertent use of non-local control flow in situations where it would be inappropriate. Examples would be when the closure is passed to another thread, or stored in a data structure to be invoked at a later time when the method invocation in which the closure originated no longer exists.

Note: The closure conversion occurs at compile-time, not at runtime. This enables javac to allocate only one object, rather than both a closure and an anonymous class instance.

We are considering an additional qualifier on non-final local variables that allows a closure to access the variable.

Example

We can use the existing Executor framework to run a closure in the background:
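A minimal sketch of this use case in present-day Java follows; it uses lambda syntax as a stand-in for the proposed closure literal, so it illustrates the idea rather than the proposal itself. The block is converted to the one-method interface `Runnable`, and because `Executor.execute` is an asynchronous receiver, the block only touches a final reference from the enclosing scope. The class and method names are invented for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; ExecutorSketch and runInBackground are invented names.
class ExecutorSketch {
    static String runInBackground() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // The captured local is never reassigned; the task mutates the object it refers to.
        AtomicReference<String> message = new AtomicReference<>();

        // The block is converted to Runnable, whose single method takes no
        // arguments, returns void, and throws no checked exceptions.
        executor.execute(() -> message.set("hello"));

        executor.shutdown();
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("background task did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return message.get();
    }

    public static void main(String[] args) {
        System.out.println(runInBackground());
    }
}
```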

5. Exception type parameters

To support exception transparency, we add a new type of generic formal type argument to receive a set of thrown types. [This deserves a more formal treatment] What follows is an example of how the proposal would be used to write an exception-transparent version of a method that locks and then unlocks a java.util.concurrent.Lock, before and after a user-provided block of code:

This uses a newly introduced form of generic type parameter. The type parameters E are declared to be used in throws clauses. If the extends clause is omitted, a type parameter declared with the throws keyword is considered to extend Exception (instead of Object, which is the default for other type parameters). This type parameter accepts multiple exception types. For example, if you invoke it with a block that can throw IOException or NumberFormatException the type parameter E would be IOException|NumberFormatException. In those rare cases that you want to use explicit type parameters with multiple thrown types, the keyword throws is required in the invocation, like this:

You can think of this kind of type parameter as accepting disjunctive ("or") types. This allows it to match the exception signature of a closure that throws any number of different checked exceptions. If the block throws no exceptions, the type parameter would be the type null.

Type parameters declared this way can be used only in a throws clause or as a type argument to a generic method, interface, or class whose type argument was declared this way too.
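For comparison, here is a sketch of how far existing Java generics get toward exception transparency. It is not the proposed `throws` type parameter: an ordinary type parameter `E extends Exception` can stand for only one exception type, not a disjunction. The names `WithLockSketch`, `Block`, and `demo` are assumptions made for this example.

```java
import java.io.IOException;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch using ordinary generics, not the proposed "throws E" form.
class WithLockSketch {
    // One-method interface whose method may throw a single exception type E.
    interface Block<E extends Exception> {
        void run() throws E;
    }

    // Locks, runs the block, and always unlocks; whatever E the block throws
    // propagates to the caller with its static type intact.
    static <E extends Exception> void withLock(Lock lock, Block<E> block) throws E {
        lock.lock();
        try {
            block.run();
        } finally {
            lock.unlock();
        }
    }

    static boolean demo() {
        ReentrantLock lock = new ReentrantLock();
        StringBuilder log = new StringBuilder();

        // The block throws nothing checked, so E is inferred as
        // RuntimeException and no try/catch is needed here.
        withLock(lock, () -> log.append("hello"));

        try {
            // Explicit type argument: the block throws IOException,
            // so the call site must handle IOException.
            WithLockSketch.<IOException>withLock(lock, () -> {
                throw new IOException("boom");
            });
        } catch (IOException e) {
            log.append(" caught ").append(e.getMessage());
        }
        return !lock.isLocked() && log.toString().equals("hello caught boom");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A block that could throw two unrelated checked exceptions would force `E` up to a common supertype in this sketch, which is the limitation the disjunctive type parameter described above is meant to remove.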

6. Non-local control flow

One purpose for closures is to allow a programmer to refactor common code into a shared utility, with the difference between the use sites being abstracted into a closure. The code to be abstracted sometimes contains a break, continue, or return statement. This need not be an obstacle to the transformation. A break or continue statement appearing within a closure may transfer to any matching enclosing statement provided the target of the transfer is in the same innermost ClassBody.

A return statement always returns from the nearest enclosing method or constructor.

If a break statement is executed that would transfer control out of a statement that is no longer executing, or is executing in another thread, the VM throws a new unchecked exception, UnmatchedNonlocalTransfer. Similarly, an UnmatchedNonlocalTransfer is thrown when a continue statement attempts to complete a loop iteration that is not executing in the current thread. Finally, an UnmatchedNonlocalTransfer is thrown when a return statement is executed if the method invocation to which the return statement would transfer control is not on the stack of the current thread.
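The proposal does not say how such transfers would be implemented. Purely as an illustration, and as an assumption rather than the proposed mechanism, a non-local return can be emulated in present-day Java by throwing a private unchecked exception from the block and catching it in the method that should return. All names below are invented for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical emulation of a non-local return; not the proposal's mechanism.
class NonLocalReturnSketch {
    // Carries the "returned" value out of the block and through the helper.
    private static final class ReturnSignal extends RuntimeException {
        final Object value;

        ReturnSignal(Object value) {
            // No message, no cause, suppression disabled, no stack trace.
            super(null, null, false, false);
            this.value = value;
        }
    }

    // A shared looping utility that knows nothing about early exits.
    static <T> void forEach(List<T> items, Consumer<T> body) {
        for (T item : items) {
            body.accept(item);
        }
    }

    // The block "returns" from firstLongerThan, not merely from the block.
    static String firstLongerThan(List<String> words, int length) {
        try {
            forEach(words, w -> {
                if (w.length() > length) {
                    throw new ReturnSignal(w);
                }
            });
        } catch (ReturnSignal signal) {
            return (String) signal.value;
        }
        return null; // no word was long enough
    }

    public static void main(String[] args) {
        System.out.println(firstLongerThan(Arrays.asList("a", "bbb", "cccc"), 2));
    }
}
```

If the block in this sketch were run on another thread, or after `firstLongerThan` had already returned, no matching catch would be on the stack. That is the situation in which the proposal has the VM throw UnmatchedNonlocalTransfer.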

7. Definite assignment

The body of a closure may not assign to a final variable declared outside the closure.

A closure expression does not affect the DA/DU status of any variables it names.

8. The type Unreachable

We add the non-instantiable type java.lang.Unreachable, corresponding to the standard type-theoretic bottom. Values of this type appear only at points in the program that are formally unreachable. This is necessary to allow transparency for closures that do not return normally. Unreachable is a subtype of every type (even primitive types). No other type is a subtype of Unreachable. It is a compile-time error to convert null to Unreachable. It is an error to cast to the type Unreachable.

Example

The following illustrates a closure being assigned to a variable of the correct type.
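Present-day Java has no Unreachable type, but a loosely related effect can be observed with lambdas, and the sketch below uses that as an analogue only. A block that cannot complete normally is accepted wherever a one-method interface is expected, whatever that method's return type is. The class and method names are invented for illustration.

```java
import java.util.concurrent.Callable;
import java.util.function.Supplier;

// Hypothetical analogue; modern Java lambdas stand in for the proposed closures.
class UnreachableSketch {
    static int countThrowing() {
        // The same always-throwing block is compatible with a void method,
        // a method returning String, and a method returning Integer.
        Runnable r = () -> { throw new IllegalStateException("never completes"); };
        Callable<String> c = () -> { throw new IllegalStateException("never completes"); };
        Supplier<Integer> s = () -> { throw new IllegalStateException("never completes"); };

        int thrown = 0;
        try { r.run(); } catch (IllegalStateException e) { thrown++; }
        try { c.call(); } catch (Exception e) { thrown++; }
        try { s.get(); } catch (IllegalStateException e) { thrown++; }
        return thrown;
    }

    public static void main(String[] args) {
        System.out.println(countThrowing()); // prints 3
    }
}
```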

9. Unreachable statements

An expression statement in which the expression is of type Unreachable cannot complete normally.

10. Abbreviated invocation syntax

A new invocation statement syntax is added to make closures usable for control abstraction:

AbbreviatedInvocationStatement:
    Primary ( ExpressionList ) Statement
    Primary ( Formals ) Statement
    Primary ( Formals : ExpressionList ) Statement

This syntax is a shorthand for the following statement:

Primary ( ExpressionList, ( Formals ) { Statement } );

This syntax makes some kinds of closure-receiving APIs more convenient to use to compose statements.

Note: There is some question of the correct order in the rewriting. On the one hand, the closure seems most natural in the last position when not using the abbreviated syntax. On the other hand, that doesn't work well with varargs methods. Which is best remains an open issue.

We could use the shorthand to write our previous example this way:

withLock(lock) { System.out.println("hello"); }

This is not an expression form for a very good reason: it looks like a statement, and we expect it to be used most commonly as a statement for the purpose of writing APIs that abstract patterns of control. If it were an expression form, an invocation like this would require a trailing semicolon after the close curly brace of a controlled block. Forgetting the semicolon would probably be a common source of error. The convenience of this abbreviated syntax is evident for both synchronous (e.g. withLock) and asynchronous (e.g. Executor.execute) use cases. Another example of its use would be an API that closes a java.io.Closeable after a user-supplied block of code:
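A sketch of such an API in present-day Java is given below, written without the abbreviated invocation syntax, so the block is passed as an ordinary last argument. The names `WithCloseableSketch`, `Block`, and `withCloseable` are assumptions made for this example, not names from the proposal.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical sketch of a close-after-use API; all names are invented.
class WithCloseableSketch {
    // One-method interface for the user-supplied block.
    interface Block<T, E extends Exception> {
        void run(T resource) throws E;
    }

    // Runs the block, then closes the resource even if the block throws.
    static <T extends Closeable, E extends Exception> void withCloseable(
            T resource, Block<T, E> block) throws E, IOException {
        try {
            block.run(resource);
        } finally {
            resource.close();
        }
    }

    static String demo() {
        StringBuilder log = new StringBuilder();
        // Closeable has a single abstract method, so a lambda can stand in for a resource.
        Closeable resource = () -> log.append("closed");
        try {
            withCloseable(resource, (Closeable r) -> log.append("used;"));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints used;closed
    }
}
```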

The idea of synchronized (synchronous) formal parameters looks interesting, but I don’t think you have exploited it to a full extent. You are giving restrictions for synchronous parameters at their call site, but there are no restrictions on their usage which makes them quite useless since it still does not protect from UnmatchedNonlocalTransfer. I will only welcome a proposal where a code that compiles without errors and warnings cannot throw the aforementioned exception. This can be achieved, for example, by further exploiting idea of synchronous parameters by restricting their usage (they should not be allowed to leak outside the current context), or by exploiting the notion of exception transparent method. I do not see any use-case where the notion of exception-transparency is not equal to synchronous, thus I do not see any need to introduce two new unrelated concepts. IMHO, only one of them is sufficient to solve uncaught UnmatchedNonlocalTransfer issue and to distinguish synchronous and asynchronous use-cases. See also my comments for version 0.1 of the proposal.

Roman: that approach has been tried, and results in a bifurcation of the type system: these synchronous closures cannot be assigned to a field or a variable of type Object, otherwise they could leak into the "wrong" context via variables. You would need the constraints enforced at the VM level.

It turns out that you can get exception transparency across (asynchronous) threads, and it is very useful! I'm working on a blog post that explains this in detail.

Neal, I can't agree that annotations belong only to declarations. Since a closure is not just an expression, but an executable fragment, it does make sense to annotate that fragment. An example that comes to mind is the @Transactional attribute on methods of EJB3 beans.

Neal, I am eager to see your proposal that addresses exception transparency across threads. In comments to your earlier post I have given a snippet of code that illustrates how to do that for SwingUtilities.invokeAndWait. However, my code was not satisfactory since it relied on catch(E e) construct that is not allowed by the current Java generics system due to implementation difficulties and I do not see an easy work-around for that.

And yes, synchronous closures (any closure that uses non-local control flow) should have a type that you simply cannot [syntactically] declare on any variable, and thus cannot assign it anywhere. The only operations you should be allowed to do on them are invocations and passing them as arguments to other methods that accept synchronous closures. I do not see any need to enforce it at JVM level. It will quite suffice to enforce it during compilation, basically copying the spirit of generics implementation in Java 5 (generics are also enforced during compile-time only). Frankly, I do not see any use-case where this restriction on synchronous closure will prove to be too limiting. All the use-cases for synchronous closures that were identified so far do not need the power to assign closure to arbitrary variable.

Glad to see the work moving forward (and for the moment closer to my preferences, but I guess I'll survive however it turns out; pros and cons to everything). On a side note (I'm curious), did my blog post discussing the use of for blocks for closures influence the syntax for the block form? Also, on this subject, could the spec elaborate on how to make "continue" work in custom fashions (such as in my prior example)?

Neal, good to see the spec moving forward - I like it better than before. In the interest of minimalism, why exactly do we need the Unreachable type? I don't see how we need to write down the signature of the always-throws-closure anywhere. As a comparison, until today I've never felt the need for writing down the type of 'null' either. As long as the typechecker can prove compatibility with my concrete signature I'm fine.

And yes, this requires enforcement at the JVM level. I'm not sure how it would be done though, but it is required. The reason is that accessing dynamically scoped values (which closures of dynamic extent are) outside their dynamic scope is fatal for a stack-based VM, so either the bytecode verifier or the VM itself must enforce that values of dynamic extent are never stored in locations that aren't also of dynamic extent.

Not so with closures of indefinite extent, but there one has different problems that some of us have blogged about enough already.

Still, I think we should explore the possibility of JVM extensions for closures of indefinite extent (ignore non-local exits of dynamic extent for the time being). The idea would be to find a way to optimize access to closed variables so that in general it does not require an additional level of indirection as compared to non-closed locals. The obvious way to do this (put activation frames on the heap) is not thread safe, therefore also out of the picture. The same applies to continuations of indefinite extent.

If it turns out that reasonable extensions to the JVM can be proposed for dealing with closures of indefinite extent then now is the time to: explore such extensions, OR, ensure that the current proposal does not preclude adding such support later.

I am afraid I don't get the "synchronized" thing. synchronized allows you to use non-local transfers, I think. But the transfers are written in the closure, and we already have the limitation that I can't assign such a closure to a variable, or not? You said it is for API writers, but what you want here is a compile-time check somehow. Why not use an annotation then? Ah, OK, because you want to enforce that such a method can't be overridden by an asynchronous one. But can't the same annotation do that job too?

- The current proposal is for continuations of dynamic extent. This should be mentioned explicitly early on in the document.

- It might be wise to not preclude future addition of support for closures of indefinite extent.

- Neither this proposal nor the original (0.1) cover JVM extensions or guidance to compiler writers...

- It would be nice if there were a Closure virtual super-class of all closures, and DynamicExentClosure and IndefiniteExentClosure sub-classes of Closure, so all closures of dynamic extent could be sub-classes of DynamicExentClosure and so on. Then the synchronized keyword can go away -- declaring arguments to be instances of DynamicExentClosure would serve that purpose.

- Groovy has a keyword, 'it', for referencing the one argument of closures that define no arguments.

I think this is the right approach (if you can add keywords) for things like "the closure itself."

So, consider adding a keyword, say, 'recurse' that can refer to the closure currently being executed by the VM, so recursive closures can be written.

Thus 'this' in a closure can keep referring to the instance of the class the method of which created the closure that uses 'this' without losing the ability of the closure to refer to itself (and use reflection on itself) -- you just add a new name for the current closure.

- If 'return' returns from the enclosing lexical scope, how does a closure return to its caller? Perhaps I missed this.

blackdrag: annotations are not a back-door for adding declaration qualifiers. Other than the possibility that the annotation itself is illegal, an annotation does not affect the semantics of any other code.

Nico: your proposal to add new interfaces for the different types of closures doesn't easily allow existing interfaces to be retrofitted to using them.

It is intentional that implementation strategy is absent from the proposal. This is just the language side. There are a number of possible ways this could work on the implementation side. Many of the implementation techniques don't require cooperation among compiler writers and don't require VM extensions.

Any syntax for accessing the closure within itself would introduce a difficulty in abstracting arbitrary blocks of code (see my post on Tennent's Correspondence Principle).

The result value of a closure can be produced by placing an expression at its end. There are no currently proposed constructs for "breaking out of" a closure. This is similar to other control constructs (e.g. how do you "break out of" the else clause of an if-statement or the try block of a try-catch statement?).

We have a few ideas to consider for defining looping APIs that work naturally with break and continue, but we'll leave that for future discussion.

I don't understand your first comment. Also, my point was that future addition of closures of indefinite extent should not be precluded now, which to me means that we should have some idea of how they would fit in, that they should fit in fairly naturally.

I thought that implementors of other languages on the Java platform wanted closures. Therefore a discussion of the JVM/compiler implications of closures seems appropriate.

I'm not an expert on the JVM, but I can't see how to obtain a reference to the current invocation frame, nor how to alloca() objects on the stack (maybe I missed it when I looked), which to me is all the more reason to have at least a non-normative discussion of how one might implement this proposal.

Also, I think you may be taking the correspondence principle too far. Said principle should not prevent the addition of new keywords, like 'recurse'. Addition of new keywords might always be a problem for other reasons, but languages should provide mechanisms by which they can evolve new keywords (e.g., program files should declare the version of the language that they were written to), otherwise one paints oneself into a corner.

As for your last comment, I take it to mean that non-local exits from closure invocations are out. Yes?

I think this proposal is much clearer and it fits better with the current Java language. The use of an additional type of generic parameter, the exception, can be useful in other situations (not only closures).

Why don't you just drop the bit about synchronized parameters? Instead, wrap non-final locals that are accessed inside a closure or an inner class within a final tuple. I.e., a closure is always an inner class (closure conversion), and inner classes can access locals.

Hi, my concern is very simple: this proposal looks very much like syntactic sugar for creating an anonymous inner class. If we want to create a method that takes a closure as a parameter, are we forced to create a new interface for that closure? This would annoy the developers using closures.

Neal: I am not much of a fan of Tennent's principle; in particular I don't like macro assemblers and pass by name, both supported by the principle.

Also you can use the principle to support either the closures argument or the inner class argument. E.g. to support the inner class argument you might say ...

Using Tennent's principle you should make everything a method, so that code can be refactored by cutting and pasting. Therefore you should not base a closure on a block, because it has outlying syntax. In particular, return doesn't just return from the inner block but from the enclosing method! To compensate for this anomaly in syntax and behavior you need to annoyingly add break and continue statements to the language.

Therefore, using Tennent's principle, a closure should simply be short syntax for an anonymous inner class. This way it is easily refactored into an inner class (anon or otherwise) or a stand-alone class.

I really like the new proposal. Some of the mechanisms could perhaps be generalized:

- If you ever wanted to bring back direct closure types (and I'm not sure that they are really necessary), it would have to be in the form of anonymous interfaces, maybe complemented by a typedef-like construct (which is something I think Java should have, anyway).

- break, continue, return: if these statements were to throw exceptions, one could implement custom reactions to them. But this sounds like one of the "things to consider" for the looping API.

- return: the difficulties one encounters with "return" and closures are similar to using the "extract method" refactoring on a sequence of statements. The refactoring does not work if those statements contain a "return". What would be needed here is a mechanism very similar to exceptions: The ability to define locations in dynamic scope that one can return to with a result. Additionally, there should be a standard location meaning "go up one level in the call tree". For closures, a loop utility method could use that location to implement "return" itself (if "return" were to throw an exception).

Is there a good reason why the checked exception handling got so complex? Wouldn't it be easier to just uncheck the exceptions that are not listed in the interface signature?

Even if you go along with the current proposal it would definitely be a good idea to make the conversion procedure uncheck exceptions by default (unless they are listed either explicitly or polymorphically). This would give the developers the freedom of not having to mention them 99% of the time, while preserving the power to still process the exceptions when needed.

And oh, when will Java already get local type inference :) Wouldn't be any trouble at all with a bit of inferer magic.

I have worked as a software engineer for many years. I did Scheme (a Lisp dialect), I did Smalltalk, I did C and C++, and finally I crossed paths with Java... And? It was a kind of relief, coming with a deep sigh: none of the long-known obscure constructs with academic pretence. None of that hard-to-read STL generic stuff anymore. A simple and yet powerful language. Object-oriented. Easy, but with the kind of flexibility you need. I give you one citation from Joshua Bloch's "Effective Java". He says "Throw exceptions appropriate to the abstraction". I give you Joshua's words not because of the exception thing; I like the "appropriate to the abstraction" part: I mean, better to use the appropriate, simple, easy-to-grasp abstraction than an obscure closure. Yes, I found the example given here to be rather obscure. It is definitely not simple, not easy to grasp, and it left me without any clue what the value of closures is. Most alarming is this syntax gibberish... I dare to tell you that it's going to raise error rates in complex projects. Furthermore, I can't see what closures would really make better. Is there a "big problem" in Java that "the community complained about" for the last few years and to which the answer is "closures"? I missed that discussion. I give you some more things to think about: "never change a running system (if you don't have to)" and "keep it stupid simple - kiss". IMHO closures violate both principles: no "have to", not "simple". Guys: witty ideas (closures) are not always clever! If you have another boring Friday sitting around in your R&D department offices, then better come up with something more convincing... which solves a problem rather than becoming one... Sorry for that emotional post.

About Me

Neal Gafter works for Microsoft on the evolution of the .NET platform languages. He also has been known to kibitz on the evolution of the Java language. Neal was granted an OpenJDK Community Innovators' Challenge award for his design and implementation of lambda expressions for Java. He was previously a software engineer at Google working on Google Calendar, and a senior staff engineer at Sun Microsystems, where he co-designed and implemented the Java language features in releases 1.4 through 5.0. Neal is coauthor of "Java Puzzlers: Traps, Pitfalls, and Corner Cases" (Addison Wesley, 2005). He was a member of the C++ Standards Committee and led the development of C and C++ compilers at Sun Microsystems, Microtec Research, and Texas Instruments. He holds a Ph.D. in computer science from the University of Rochester.