Friday, December 15, 2006

I'm on my way back from Javapolis 2006, unfortunately missing the third
day of the conference. One of the innovations this year was a bank of ten whiteboards where people brainstormed about the future of the Java language
and platform. I had a camera with me but I realized too late that I should
have taken snapshots of the contents of all those boards before I left. I hope
someone else will. The only snapshots I took were the vote tallies that I
discuss below.

There were three issues discussed on those boards that I'd like to share with you.

Closures

The first board contained a discussion of closures, including an informal
vote of people "in favor of" and "against" adding support
for closures. I gave a talk about closures yesterday afternoon, which explained
the goals and design criteria, showed how they fit quite smoothly into the
existing language and APIs, and more importantly explained in detail why
anonymous instance creation expressions do not solve the problems closures
were designed to solve. Before my talk about closures, the vote was about 55%
in favor and 45% against. After my talk the vote was 71% in favor and 29%
against. I don't think any "against" votes were added to the tally
after my talk. There was also a BOF (Birds-Of-a-Feather) session in
the evening discussing closures. Of the 20 or so attendees none admitted being against
adding closures. I'm sure the fact that the BOF was scheduled opposite a free
showing of Casino Royale didn't help, but I had hoped to hear from opponents more about their concerns. We discussed a number of issues and
concerns, most of which had been discussed on my blog at one time or another.

One issue that I discussed with a few individuals earlier and then at
the BOF was the idea of adding a statement whose purpose is to yield a
result from the surrounding closure, which you could use instead of or
in addition to providing a result as the last expression in the closure.
It turns out that adding this feature makes closures no longer suitable
for expressing control abstractions. In other words, adding
this feature to the language would make closures less
useful! This is a very counterintuitive result. I first understood the
issue by analyzing the construct using Tennent's Correspondence Principle, but it is much easier for most people to
understand when the result is presented as specific examples that fail
to work as expected. For now I'll leave this as an exercise to the reader,
but I'll probably write more about it later. Incidentally, I believe the
folks who designed Groovy got closures wrong for exactly this reason.
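
One way to get a feel for the kind of failure involved is the sketch below. It is an assumption-laden illustration, not necessarily the example meant above: it uses the lambda expressions that later versions of Java shipped rather than the syntax proposed at the time, and the class and method names are made up. In those lambdas, return yields from the lambda itself, so moving a loop body into a library-defined loop changes what the body means.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; names are invented for this sketch.
class ClosureReturnSketch {
    // Built-in loop: 'return' leaves the enclosing method with a result.
    static String findWithLoop(List<Integer> values) {
        for (int v : values) {
            if (v < 0) {
                return "found " + v;
            }
        }
        return "none";
    }

    // Library-defined loop (forEach): the same body moved into a lambda.
    // Here 'return' only ends the current call of the lambda, so the
    // iteration continues and the method cannot hand back what it found.
    static String findWithForEach(List<Integer> values) {
        values.forEach(v -> {
            if (v < 0) {
                return; // yields from the lambda, not from findWithForEach
            }
        });
        return "none";
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, -1, 4);
        System.out.println(findWithLoop(values));
        System.out.println(findWithForEach(values));
    }
}
```

With the list 3, -1, 4 the first method returns "found -1" and the second returns "none", even though the two loop bodies look like the same code.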

There was a video made of my talk that will be posted to the Javapolis
website. Ted Neward also interviewed me on video, and Bill Venners
interviewed me on audio. As soon as these are available on the web
I'll blog pointers to them.

Native XML Support

Mark Reinhold gave a talk on adding native (i.e. language-level) support
for XML into Java. Though they were not presented as such, some
people prefer to think of the proposal as separable into a language
extension part and an API part. The
proposed APIs appear to be an improvement over the existing alternatives.
However, the language extension for writing XML literals appears to be only
marginally more convenient than the XML construction techniques provided by
libraries such as JDOM. I personally
would like to see the new APIs pursued but XML creation provided in the
JDOM way. Mark took a vote by show of hands on how people felt about the two
issues, but I couldn't see the tally. There was also an informal tally about
adding native XML support on one of the whiteboards. The result was 29% in
favor and 71% against.

Constructor Invocation for Generics

Another much-discussed issue appeared on one of the whiteboards: the
verbosity of creating instances of generic types. This is typical:

Map<Pair<String,Integer>,Node> map = new HashMap<Pair<String,Integer>,Node>();

The problem here is that the type parameters have to be written twice.
One common "workaround" to this problem is to always create your
generic objects using static factories, but the language should not force
you to do that. A number of different syntax forms have been suggested for
fixing this:

var map = new HashMap<Pair<String,Integer>,Node>();

This unfortunately requires the addition of a new keyword. Another:

final map = new HashMap<Pair<String,Integer>,Node>();

This reuses an existing keyword, but at the same time it also makes the
variable final. Another variation on this idea:

map := new HashMap<Pair<String,Integer>,Node>();

In my opinion these three forms all suffer the same flaw: they place the type
parameters in the wrong place. Since the variable is to be used later in the
program, the programmer presumably wants control over its type and the reader wants the type to be clear from the variable declaration. In this case
you probably want the variable to be a Map and not a
HashMap. An idea that addresses my concern is:

Map<Pair<String,Integer>,Node> map = new HashMap();

Unfortunately, this is currently legal syntax that creates a raw HashMap. I don't know if it is possible to change the meaning of
this construct without breaking backward compatibility. Another possibility:

Map<Pair<String,Integer>,Node> map = new HashMap<>();

You can see clearly by the presence of the empty angle brackets that the
type parameters have been omitted where the compiler is asked to infer them.
Of the alternatives, this is my favorite. I don't think it will be too hard to
implement in javac using the same techniques that work for static factories of
generic types.
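
The static-factory technique mentioned above can be sketched roughly as follows. This is only an illustration: newHashMap is a hypothetical helper, and Pair and Node are minimal stand-ins for the types used in the declarations above. The compiler infers the factory's type arguments from the declared type of the variable. The last declaration in demo uses the empty-angle-bracket form, which compiles on Java 7 and later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; all names here are invented for illustration.
class InferenceSketch {
    // Minimal stand-ins for the Pair and Node types used in the post.
    static final class Pair<A, B> {
        final A first;
        final B second;
        Pair(A first, B second) { this.first = first; this.second = second; }
    }

    static final class Node {
        final String label;
        Node(String label) { this.label = label; }
    }

    // Static factory: K and V are inferred at the call site from the
    // type of the variable being initialized.
    static <K, V> Map<K, V> newHashMap() {
        return new HashMap<K, V>();
    }

    static int demo() {
        // Type parameters written once, on the declaration only.
        Map<Pair<String, Integer>, Node> viaFactory = newHashMap();
        // The same inference requested with empty angle brackets.
        Map<Pair<String, Integer>, Node> viaDiamond = new HashMap<>();

        Pair<String, Integer> key = new Pair<String, Integer>("a", 1);
        viaFactory.put(key, new Node("x"));
        viaDiamond.put(key, new Node("y"));
        return viaFactory.size() + viaDiamond.size();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In both declarations the variable is a Map rather than a HashMap, so the implementation class can be swapped by changing only the initializer.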

21 comments:

Anonymous said...

Map<Pair<String,Integer>,Node> map = new HashMap<>();

I think you haven't taken into consideration the looks of this construct ... which is the ugliest of all the other alternatives. Special signs should not be abused this way.

And, of all the alternatives, I find this one the most readable:

var map = new HashMap<Pair<String,Integer>,Node>();

If someone wants a Map, it can always be explicitly declared.

Also, I really hope you find a nice and clear syntax for closures, because the proposals I saw look horrible.

If you are going to create a Map with new, 98% of the time it is going to be a HashMap. It therefore makes sense to me that Map should nominate a default implementation class (or overloaded set of static methods). So we don't need the information as to which type to construct in the client code.

Because "closure" is a widely misused term, you shouldn't say the Groovy designers got them wrong, not without an example of what you mean. Then we can discuss what exactly is wrong. Oh, and I hope you don't think in terms of implementation here.

I can't go into details but I feel like the features being added to Java are "tacked on" and don't really fit well. Even generics bring along their own set of issues.

Closures support looked ugly and crippled, and native XML support is ugly period. I think Groovy builders got it right.

I'd like to see Java stay simple, and support for higher level language features be provided by another language, be it JRuby or Groovy.

If it still compiles to Java classes (Groovy does this and JRuby will) there will be no concern about interoperability. And those who are 'concerned' about performance can write the logic-complex portions in the dynamic language, and the rest in Java.

If you find yourself needing those language features everywhere... Well then maybe the Java hammer isn't the answer and the JVM should focus on support for high-level dynamic languages.

The possibility to write Map<Pair<String,Integer>,Node> map = new HashMap<>(); would be nice syntactic sugar. Although it is the most verbose, it is also the best of all the possibilities mentioned. Introducing a new keyword for sugar only is IMO out of the question, and this proposal looks exactly like what it is: a slightly sweetened version of an already existing language construct.

Why can't you get rid of the new, since there isn't the "allocate on the stack vs allocate on the heap" issue that C++ has? What I mean is:

X x = new X(y, z);

becomes

X x(y, z);

So the case you cite would become:

Map<Pair<String,Integer>,Node> map();

It should still be possible to write

X x; // An uninitialised X
X x = another; // Assign from another instance

exactly as is done now.

Firstly, it keeps the distinction between interface (Map) and implementation (HashMap).

Secondly, when you're omitting type parameters from a method call, you don't have to write <> either; and as you say, the analogy between a constructor call and a static method call is so strong that both would be implemented "in javac using the same techniques". Analogy in syntax should follow.

Thirdly, it is a compatible extension: If the new object is used only as the initializer of a declaration, then (apart from compiler warnings) there is no difference between a "raw" HashMap and a correctly typed one. The byte code is the same either way.

@Rémi: Concise typing can make things easier for a reader - like a comment that's checked by the compiler. When you declare a "Map" variable, then you document that the following code will not rely on any specifics of HashMap, TreeMap,...

If you change your mind about the kind of Map you want to use later on, you can be sure you'll only have to change the initializer.

By the way, type inference should also work for method arguments! Semantically, passing an argument to a parameter is almost the same as initializing a variable.

* Doesn't close the door on reified generics where someday in the future 'new HashMap()' and 'new HashMap<>()' may not be equivalent statements at runtime anymore.

* No new keywords or operators

* Has the type parameters on the declaration, which is where they belong for consistency. It's the most Java-like.

* As axel notes, I can declare the variable as Map, which means it can become TreeMap or WeakHashMap (or any of the concurrent maps) by simply changing one line - yes, it may be a local variable, but it doesn't mean I'm not going to return it from the method! (besides, who said I can't use this syntax on instance variables?)

The argument against the "var" syntax for type inference seems rather weak. The reason I'd like to leave out the type for the local variable is because I *don't* want to figure out the type. Local variables have such limited scope that I'm content to let the compiler figure it out, even if it means that the variable's type is HashMap where I would have written Map. If it were important to declare the type for some particularly confusing code then I can already do so by writing out the type explicitly.

@lordpixel: The "reified generics" argument for empty angle brackets is not very strong - the problem would have to be solved for static methods anyway.

@brian: that goes in the direction of either a scripting language (no need for a safety net if you're not going to jump) or a mind-bending functional language (making the code as dense as possible so you can jump higher). It's actually a feature of Java that it's so plain and obvious, most of the time. I'd rather mix languages than mix up Java.

I haven't heard any reason yet against turning raw types into inferred types, wherever possible - the compiler could still warn (as currently for a raw type) if it cannot determine a sensible typing.

Following lordpixel's idea - how about Map<Pair<String,Integer>,Node> map = new HashMap<...>() or <*> And the parser would replace ... or * with the last generic type? Or maybe

Map<T = <Pair<String,Integer>,Node>> map = new HashMap<T>()

About Me

Neal Gafter is a Computer Programming Language Designer, Amateur Scientist and Philosopher.
He works for Microsoft on the evolution of the .NET platform languages.
He also has been known to kibitz on the evolution of the Java language.
Neal was granted an OpenJDK Community Innovators' Challenge award for his design and
implementation of lambda expressions for Java.
He was previously a software engineer at Google working on Google Calendar, and a senior staff engineer at Sun Microsystems,
where he co-designed and implemented the Java language features in releases 1.4 through 5.0. Neal is coauthor of
Java Puzzlers: Traps, Pitfalls, and Corner Cases (Addison Wesley, 2005). He was a member of the C++ Standards
Committee and led the development of C and C++ compilers at Sun Microsystems, Microtec Research, and Texas Instruments.
He holds a Ph.D. in computer science from the University of Rochester.