Covariance and Contravariance in C#, Part Two: Array Covariance

Ever since C# 1.0, arrays where the element type is a reference type are covariant. This is perfectly legal:

Animal[] animals = new Giraffe[10];

Since Giraffe is smaller than Animal, and “make an array of” is a covariant operation on types, Giraffe[] is smaller than Animal[], so an instance fits into that variable.

Unfortunately, this particular kind of covariance is broken. It was added to the CLR because Java requires it and the CLR designers wanted to be able to support Java-like languages. We then up and added it to C# because it was in the CLR. This decision was quite controversial at the time and I am not very happy about it, but there’s nothing we can do about it now.

Why is this broken? Because it should always be legal to put a Turtle into an array of Animals. With array covariance in the language and runtime you cannot guarantee that an array of Animals can accept a Turtle, because the backing store might actually be an array of Giraffes.
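A minimal program demonstrating the failure (the class names follow the examples in this series; Turtle is assumed to derive from Animal):

```csharp
using System;

class Animal { }
class Giraffe : Animal { }
class Turtle : Animal { }

class Program
{
    static void Main()
    {
        // Legal at compile time: array covariance says Giraffe[] is "smaller than" Animal[].
        Animal[] animals = new Giraffe[10];
        try
        {
            // Also legal at compile time -- a Turtle is an Animal -- but the
            // backing store is really a Giraffe[], so the CLR's per-store
            // type check fails at run time.
            animals[0] = new Turtle();
        }
        catch (ArrayTypeMismatchException)
        {
            Console.WriteLine("ArrayTypeMismatchException at run time");
        }
    }
}
```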

This means that we have turned a bug which could be caught by the compiler into one that can only be caught at runtime. This also means that every time you put an object into an array we have to do a run-time check to ensure that the type works out and throw an exception if it doesn’t. That’s potentially expensive if you’re putting a zillion of these things into the array.

Yuck.

Unfortunately, we’re stuck with this now. Giraffe[] is smaller than Animal[], and that’s just the way it goes.

I would like to take this opportunity to clarify some points brought up in comments to Part One.

First, by "subtype" and "supertype" I mean "is on the chain of base classes" for classes and "is on the tree of base interfaces" for interfaces. I do not mean the more general notion of "is substitutable for". And by “bigger than” and “smaller than” I explicitly do NOT mean “is a supertype of” and “is a subtype of”. It is the case that every subclass is smaller than its superclass, yes, but not vice versa. That is, it is not the case that every smaller type is a subtype of its larger type. Giraffe[] is smaller than both Animal[] and System.Array. Clearly Giraffe[] is a subtype of System.Array, but it is emphatically not a subtype of Animal[]. Therefore the “is smaller than” relationship I am defining is more general than the “is a kind of” relationship. I want to draw a distinction between assignment compatibility (smaller than) and inheritance (subtype of).
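The distinction can be made concrete with reflection. This sketch (my example, not part of the original post) shows that Giraffe[] is assignment compatible with Animal[] without being a subclass of it, while it genuinely is a subclass of System.Array:

```csharp
using System;

class Animal { }
class Giraffe : Animal { }

class Program
{
    static void Main()
    {
        Giraffe[] giraffes = new Giraffe[1];
        Animal[] animals = giraffes;   // fine: assignment compatible ("smaller than")
        Array array = giraffes;        // fine: Giraffe[] really is a subtype of System.Array

        // Assignment compatibility is not inheritance:
        Console.WriteLine(typeof(Animal[]).IsAssignableFrom(typeof(Giraffe[]))); // True
        Console.WriteLine(typeof(Giraffe[]).IsSubclassOf(typeof(Animal[])));     // False
        Console.WriteLine(typeof(Giraffe[]).IsSubclassOf(typeof(Array)));        // True
    }
}
```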

Next time we’ll discuss a kind of variance that we added to C# 2.0 which is not broken.

Probably not, but I am not an expert on the JIT. Why not try it both ways and examine the jitted code in the debugger? Then you’ll know.

Keep in mind that you will want to attach the debugger to the jitted code AFTER it jits. The jitter is allowed to generate more debugger-friendly code if it detects that the process is being debugged while jitting is going on. That might include turning off certain optimizations.

Interesting that this is actually not the case for Generics. I came across this a few months ago and, after banging my head on the desk several times, finally gave up and threw together an EnumerableWrapper<S, T>. I never did understand why I couldn’t pass an IEnumerable<B> for an IEnumerable<A> parameter (where B : A), but this explains it – that’s actually the non-broken way of doing it. 🙂

(Actually, in my case, B is a class and A is an interface that it implements, but I’m sure the logic is the same)
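For readers wondering what such a workaround looks like: here is one plausible shape for an EnumerableWrapper&lt;S, T&gt; in C# 2.0, where IEnumerable&lt;T&gt; is invariant. This is my sketch of the idea, not the commenter's actual code; the names are hypothetical:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class Animal { }
class Giraffe : Animal { }

// Presents an IEnumerable<S> as an IEnumerable<T>, where S : T.
// Safe because we only ever read S values out, and every S is a T.
class EnumerableWrapper<S, T> : IEnumerable<T> where S : T
{
    private readonly IEnumerable<S> inner;

    public EnumerableWrapper(IEnumerable<S> inner)
    {
        this.inner = inner;
    }

    public IEnumerator<T> GetEnumerator()
    {
        foreach (S item in inner)
            yield return item;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

class Program
{
    static void Main()
    {
        List<Giraffe> giraffes = new List<Giraffe> { new Giraffe(), new Giraffe() };
        IEnumerable<Animal> animals = new EnumerableWrapper<Giraffe, Animal>(giraffes);
        int count = 0;
        foreach (Animal a in animals)
            count++;
        Console.WriteLine(count); // prints 2
    }
}
```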

Of course, if these were all immutable types, then it wouldn’t be an issue, right?

You are somewhat anticipating where we’re going here. In a few more posts I will propose that we add exactly that kind of variance to interfaces in a future version of C#.

In general, immutability certainly helps make things more variant. As we will see in several more posts, since immutable objects are by definition read-only, and parameters may be covariant when they appear only on the "output" side, it is easy to create covariant immutable data structures if you have covariance in the type system.

My debugging skills are not particularly l33t, but the compiled version of the program (.Net 2.0) appears to be doing exactly the same assignment regardless of whether the array’s element type is sealed or not. I fear we are stuck with runtime type checking on arrays with a reference type element type.

A link to this is posted on programming.reddit.com right now, with the title "It Should Always be Legal to Put a Turtle into an Array of Animals". Right when I read the title and saw blogs.msdn.com, I thought to myself "this has to be a post by Eric Lippert". I was right 🙂

How’s it going? (Are you going to try windsurfing tomorrow, with the storm coming in? 50+ MPH winds they say…) Did my project end up being useful to anyone? :-X

I agree with you that this situation is extremely irksome. However, it is one that we are stuck with.

However, it _is_ covariance. I have a partial ordering on types (CLR assignment compatibility) and an operator from type to type ("make an array type") which preserves the ordering. That’s the _definition_ of "covariant operator".

Perhaps you prefer "always-typesafe liskov-style substitution" as the basis for a partial ordering, rather than the sometimes-not-typesafe ordering that I’ve chosen. If that’s your choice of partial ordering, then I agree with you, making arrays is not covariant. But that’s not the partial ordering I’m choosing for the purposes of this series of articles.

You theory guys seem bound and determined to reject my choice of partial ordering today. That is your right, of course, but since I’m the guy who is going to have to implement this in a compiler that targets the CLR, choosing "assignment compatible in the CLR" is a highly pragmatic choice for me.

In this case the covariance is perfectly safe because you are treating arrays as read-only for the purposes of this method. It’s only when you try to write to the array that you have a potential problem, and you don’t do that here.

The way to get safe covariance is to separate out the read-only operations into interfaces and make them covariant. If you wrote your program to take two IEnumerable&lt;object&gt; objects and we could somehow make IEnumerable&lt;T&gt; covariant in T then that would be perfectly type safe. IEnumerable does not allow you to write to the object, only read from it, so it is safe for covariance. You could then pass your IEnumerable&lt;Giraffe&gt; and IEnumerable&lt;Animal&gt; to the function and it would be just fine.
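At the time of this post that covariance was hypothetical, but C# 4.0 later shipped exactly this: IEnumerable&lt;out T&gt; is covariant in T. A small sketch of the resulting safe pattern (my example, using the post's class names):

```csharp
using System;
using System.Collections.Generic;

class Animal { }
class Giraffe : Animal { }

class Program
{
    // Reads only: counts elements, never writes to either sequence.
    public static int CountAll(IEnumerable<Animal> first, IEnumerable<Animal> second)
    {
        int count = 0;
        foreach (Animal a in first) count++;
        foreach (Animal a in second) count++;
        return count;
    }

    static void Main()
    {
        IEnumerable<Giraffe> giraffes = new List<Giraffe> { new Giraffe() };
        IEnumerable<Animal> animals = new List<Animal> { new Animal(), new Animal() };
        // Legal in C# 4.0 and later because IEnumerable<out T> is covariant in T:
        Console.WriteLine(CountAll(giraffes, animals)); // prints 3
    }
}
```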

Of course, I have just spoiled the surprise; that is precisely where this series of posts is heading.

To elaborate on my original example, should the following code throw a compile-time error, a run-time error, or do copy-on-write? I really am curious to see what you think the right way is. I’ve been goofing around a lot with algebraic data types lately and contemplating the differences between ADTs and objects, and when each is appropriate. So far, the practical rule of thumb I’ve come up with is: If you find yourself doing RTTI, that object hierarchy should’ve been represented with ADTs.

Giraffe[] giraffes = new Giraffe[10];

…

// somewhere else

Animal[] animals = giraffes;

…

animals.append(new Turtle()); /* not sure if it’s really valid C#, but you get the point */

Personally I love array covariance, and wish generic lists were covariant. IEnumerables would be syntactically more tolerable if they were addressable like lists. In other words if you don't want array covariance, a dedicated IReadableList might have been worthwhile.