No, my opinions as expressed in this blog are not those of my employer. They're mine alone. That's why they're called "my opinions." Helloooo.

Thursday, December 6, 2007

Why does Set.contains() take an Object, not an E?

Virtually everyone learning Java generics is initially puzzled by this. Why should code like the following compile?

Set<Long> set = new HashSet<Long>();
set.add(10L);
if (set.contains(10)) {
  // we won't get here!
}

We're asking if the set contains the Integer ten; it's an "obvious" bug, but the compiler won't catch it because Set.contains() accepts Object. Isn't this stupid and evil?
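For anyone who wants to run it, here is a minimal sketch of the same mistake. The class name ContainsDemo and its helper methods are made up for illustration and are not from any library. The literal 10 is boxed to an Integer, and an Integer is never equal to a Long, so the lookup quietly misses.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative class; the name and helpers are assumptions, not part of the post.
class ContainsDemo {
    static Set<Long> sample() {
        Set<Long> set = new HashSet<Long>();
        set.add(10L);
        return set;
    }

    // The int literal 10 is boxed to Integer, which never equals a Long.
    static boolean lookupWithInt() {
        return sample().contains(10);
    }

    // The long literal 10L is boxed to Long, which matches the stored element.
    static boolean lookupWithLong() {
        return sample().contains(10L);
    }

    public static void main(String[] args) {
        System.out.println(lookupWithInt());  // false
        System.out.println(lookupWithLong()); // true
    }
}
```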

A popular myth is that it is stupid and evil, but it was necessary because of backward compatibility. But the compatibility argument is irrelevant; the API is correct whether you consider compatibility or not. Here's the real reason why.

Let's say you have a method that wants to read from a Set of Foos:

public void doSomeReading(Set<Foo> foos) { ... }

The problem with this signature is it won't allow a Set<SubFoo> to be passed in (where SubFoo is, of course, a subtype of Foo).

To preserve the substitutability principle, any method that wants to read from a set of Foos should be equally able to read from a set of SubFoos, so let's tweak our signature:

public void doSomeReading(Set<? extends Foo> foos) { ... }

Perfect!

But here's the catch: if Set.contains() accepted type E instead of type Object, it would now be rendered completely unusable to you inside this method body!

That signature tells the compiler, "don't let anyone ask about containment of an object unless you are damn sure that it's of the exact right type." But the compiler doesn't know the type -- it could be a Foo, or SubFoo, or SubSubFoo, or who knows what? Thus the compiler would have to forbid everything -- the only safe parameter to a method like this is null.

This is the behavior you want for a method like Set.add() -- if you can't make damn sure of the type, don't allow it. And that's why add() accepts only type E while contains() accepts anything.
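A small sketch of that asymmetry, assuming Foo and SubFoo are plain classes with no overridden equals() (the WildcardDemo class and its method names are invented for illustration). Through the wildcard, contains() still works because it takes Object, while the commented-out add() line is the call the compiler refuses.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch; WildcardDemo and its helpers are hypothetical names.
class WildcardDemo {
    static class Foo {}
    static class SubFoo extends Foo {}

    // Accepts Set<Foo>, Set<SubFoo>, and so on.
    static boolean hasIt(Set<? extends Foo> foos, Foo candidate) {
        // foos.add(candidate);  // does not compile: the element type is unknown
        return foos.contains(candidate); // fine, because contains() takes Object
    }

    static boolean memberFound() {
        SubFoo member = new SubFoo();
        Set<SubFoo> subFoos = new HashSet<SubFoo>();
        subFoos.add(member);
        return hasIt(subFoos, member);
    }

    static boolean strangerFound() {
        Set<SubFoo> subFoos = new HashSet<SubFoo>();
        subFoos.add(new SubFoo());
        return hasIt(subFoos, new Foo());
    }

    public static void main(String[] args) {
        System.out.println(memberFound());   // true
        System.out.println(strangerFound()); // false
    }
}
```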

So the distinction I'm making is between read methods and write methods, right? No, not exactly -- notice that Set.remove() also accepts Object, and it's a write method. The real difference is that add() can cause "damage" to the collection when called with the wrong type, and contains() and remove() cannot.
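To make "damage" concrete, here is a hedged sketch (DamageDemo and its methods are hypothetical names). Calling remove() with the wrong type simply finds nothing. To get a wrong-typed element past add(), the sketch has to cheat with a raw type, and the set then fails when it is read back as a Set<Long>.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch; DamageDemo is a hypothetical name.
class DamageDemo {
    // remove() with an Integer finds no match and leaves the set intact.
    static Set<Long> removeWithWrongType() {
        Set<Long> set = new HashSet<Long>();
        set.add(10L);
        set.remove(10); // boxed to Integer, so nothing is removed
        return set;
    }

    // add() with the wrong type, forced through a raw type, breaks later reads.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean addWrongTypeBreaksReads() {
        Set<Long> set = new HashSet<Long>();
        set.add(10L);
        Set raw = set;    // the raw type bypasses the compile-time check on add()
        raw.add("oops");  // the set now holds a String
        try {
            for (Long value : set) { // the implicit cast to Long fails on the String
                value.longValue();
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeWithWrongType().size()); // 1
        System.out.println(addWrongTypeBreaksReads());    // true
    }
}
```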

Uniformly, methods of the Java Collections Framework (and the Google Collections Library too) never restrict the types of their parameters except when it's necessary to prevent the collection from getting broken.

So what to do about this vexing source of bugs, as illustrated at top? Well, when I typed that code into IntelliJ, it flagged a warning for me right away. This let me know to either fix the problem or add an annotation/comment to suppress it. Problem solved.

Static analysis plays an extremely important role in the construction of bug-free software. And the very best kind of static analysis is the kind that pops up in your face the second you write something questionable.

The moral of the story: if you're not coding in a good, modern IDE, you're coding with one hand tied behind your back!

However, I have a slightly different view of why contains() and remove() take an Object parameter.

If the reason were substitutability, they probably could have solved it the way harryh suggests, or with a generic method: <T extends E> boolean contains(T elem); I think the reason is related to Java's history. Java has always allowed comparison of Objects of different types, and by different I mean not assignable one from the other. It probably looked like a good idea in the beginning, but in the end it does more harm than good. And it's another example where IntelliJ inspections come to the rescue and warn you about such comparisons.

Anyhow, back to the contains() and remove() methods, which both rely heavily on equality comparison. Unless the set is identity-based, the parameter of those methods need not be an actual member of the set. So since the parameter may be an object of a completely different type, the APIs don't restrict it. (I think the whole issue of differentiating between identity-based and non-identity-based maps and sets is quite confusing in java.util.*; I mean, look at WeakHashMap.)
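One concrete case of equality across types, as a hedged sketch (CrossTypeEqualityDemo is a made-up name): the List contract defines equals() and hashCode() by contents, so a LinkedList can match an ArrayList stored in a HashSet even though neither type is assignable to the other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Set;

// Illustrative sketch; CrossTypeEqualityDemo is a hypothetical name.
class CrossTypeEqualityDemo {
    static boolean linkedListFoundInSetOfArrayLists() {
        Set<ArrayList<String>> set = new HashSet<ArrayList<String>>();
        set.add(new ArrayList<String>(Arrays.asList("a", "b")));

        // A LinkedList is not an ArrayList, yet it is equal to one with the same elements.
        LinkedList<String> probe = new LinkedList<String>(Arrays.asList("a", "b"));

        // This compiles only because contains() accepts Object rather than E.
        return set.contains(probe);
    }

    public static void main(String[] args) {
        System.out.println(linkedListFoundInSetOfArrayLists()); // true
    }
}
```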

Also, I am looking at it through the glasses of an API provider who writes generic classes, and wondering: what does this example teach me? What factors should I take into consideration when I decide to restrict parameter types in a method?

P.S. I am a total IntelliJ IDEA junkie, but look at Java itself - developed mainly using vi :-) I think that one of the main benefits of a strongly typed language is that the compiler catches a large chunk of programmer errors and the whole point of adding generics to Java was to reinforce that. So I have mixed emotions regarding your closing sentence :-)

Nice post, Kevin...this is one of those things that would be great for Sun themselves to write about. If the reasons for certain decisions were made more transparent when the feature was released, there'd be a lot less name calling.

Yardena brings up a very good point: everything that Sun (or whoever) puts into the JDK becomes an example for all future programmers. If it were stated in no uncertain terms that "we're doing this for backwards compatibility, but in general this is not the right way to do what we're doing," then that would go a long way.

"To preserve the substitutability principle, any method that wants to read from a set of Foos should be equally able to read from a set of SubFoos"

The Liskov Substitution Principle (and substitutability in general) refers to the substitution of supertypes with subtypes. If methods accepting Set did not accept HashSet then you would have a point; however, that is not the case. Set<Foo> is not a supertype of Set<SubFoo>. It is a parameterized variant, and therefore not substitutable.

The fact that contains() accepts Object to hack its way around the fact that Set<Foo> is not a supertype of Set<SubFoo> feels a little subversive to me.

" giovanni said...I think Set.contains() take an Object, not an E, since Object.equals take a Object not an E. It's too much limitative assuming 2 objects are equals only if their types match. "

While this is technically correct (nothing in the JDK says equals() refers to assignable types), it completely contradicts the idea of parameterized collections, where by definition a Set<T> must contain only instances of type T.

Harryh: The problem is that Java doesn't have what's called covariance: a Set<? extends E> isn't a Set<E>, because you can't add just any E to it. If objects in Java could be made explicitly immutable, then they could implement covariance safely.

Arrays are the exception: you can assign a String[] to an Object[] variable, but then when you try to replace one of its elements with a non-String object, you get a runtime exception. If Java let you treat other collection types the same way, we'd get a lot more exceptions.
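The array behavior described above can be seen in a few lines. This is a minimal sketch with an invented class name (ArrayCovarianceDemo): the assignment compiles, and the type error only surfaces at run time as an ArrayStoreException.

```java
// Illustrative sketch; ArrayCovarianceDemo is a hypothetical name.
class ArrayCovarianceDemo {
    static boolean storeFails() {
        Object[] objects = new String[1]; // allowed: arrays are covariant
        try {
            objects[0] = Integer.valueOf(42); // compiles, but the array is really a String[]
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeFails()); // true
    }
}
```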

What useful thing could you do inside that method `doSomeReading`? Let's assume, for argument's sake, that get and remove were typed. Would it make sense to call get or remove in that method? I don't think it would. Let's try to think of an example:

doSomething(Set foos) {
  // Remove the French foo
  foos.remove(new Foo("plop"));
}

Now let's think about when it would make sense to call doSomething. It would probably make sense to call it with a Set<Foo>, as it might contain a Foo, but does it make sense to call it with a Set<SubFoo>? I doubt it does. Why would such a set ever contain a new Foo("plop")?

Some people might argue that it matters because new Foo("plop").equals(new SubFoo("plop")) could be true. I do wonder how often this actually occurs in a case where the Set could not be declared with the supertype, in this case Set<Foo>. Finally, to handle such an edge case, methods like getByEquals(Object o) could be added. At least this way it is very clear when typing is not being used.

I suspect this is just like the switch statement where the default behavior of falling through is the wrong thing to do in most cases. Effectively casting to Object is probably the wrong thing to do.

It is, and in fact I know quite a few people that have done it, including myself. Knowing a little bit about the JVM would help, however, but it's also something you can pick up in a few days. The only other aspect to take into account is that the documentation and other resources often make comparisons to Java and reference the JDK, but these aren't show stoppers.

So if you don't know Java the language, it's fine but I'd recommend getting to grips with

What the JVM is and the basics of how a Java application is compiled and run. Understand the class path, what a jar is, what a class file is, etc.
