DaedTech

What To Return: IEnumerable or IList?

I’ve received a couple of requests in various media to talk about this subject, with the general theme being “I want to return a bunch of things, so what type of bunch should I use?” I’m using the term “bunch” in sort of a folksy, tongue-in-cheek way, but also for a reason relating to precision — I can’t call it a list, collection or group without evoking specific connotations of what I’d be returning in the C# world (as those things are all type names or closely describe typenames). So, I’m using “bunch” to indicate that you want to return a “possibly-more-than-one.”

I suspect that the impetus for this question arises from something like a curt code review or offhand comment from some developer along the lines of “you should never return a list when you could return an IEnumerable.” The advice lacks nuance for whatever reason and, really, life is full of nuance. So when and where should you use what? Well, the stock consultant answer of “it depends” makes a good bit of sense. You’ll also probably get all kinds of different advice from different people, but I’ll describe how I decide and explain my reasoning.

First Of All, What Are These Things?

Before we go any further, it probably makes sense to describe quickly what each of these possible return values is. IList is probably simpler to describe. It’s a collection (I can use this term because it inherits from ICollection) of objects that can be accessed via indexers, iterated over and (usually) rearranged. Some implementations of IList are readonly, others are fixed size, and others are variable size. The most common implementation, List<T>, is, for the sake of quick, easy understanding, basically a dynamic array.

I’ve blogged about IEnumerable in the past and talked about how this is really a unique concept. Tl;dr version is that IEnumerable is not actually a collection at all (and it does not inherit from ICollection), but rather a combination of an algorithm and a promise. If I return an IEnumerable to you, what I’m really saying is “here’s something that when you ask it for the next element, it will figure out how to get it and then give you the element until you stop asking or there are none left.” In a lot of cases, something with return type IEnumerable will just be a list under the hood, in which case the “strategy” is just to give you the next thing in the list. But in some cases, the IEnumerable will be some kind of lazy loading scheme where each iteration calls a web service, hits a database, or for some reason invokes a 45 second Thread.Sleep. IList is (probably) a data structure; IEnumerable is an algorithm.
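To make the “algorithm and a promise” idea concrete, here is a minimal sketch of an iterator method. The names NumberSource, CountUpTo and ElementsProduced are made up for illustration, and the counter simply stands in for whatever slow step (web service, database, sleep) a real implementation might perform.

```csharp
using System;
using System.Collections.Generic;

public static class NumberSource
{
    // Hypothetical counter, here only so the laziness is observable.
    public static int ElementsProduced;

    // An iterator method: none of the body runs when the method is called.
    // Each element is computed only when the caller asks for the next one.
    public static IEnumerable<int> CountUpTo(int limit)
    {
        for (int i = 1; i <= limit; i++)
        {
            ElementsProduced++; // stand-in for a slow step
            yield return i;
        }
    }
}
```

Calling NumberSource.CountUpTo(5) produces nothing by itself. If a caller enumerates the result and stops after the second element, only two elements have ever been computed.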

Since they’re different, there are cases when one or the other clearly makes sense.

When You’d Clearly Use IEnumerable

Given what I’ve said, IEnumerable (or perhaps IQueryable) is going to be your choice when you want deferred execution (you could theoretically implement IList in a way that provided deferred execution, but in my experience, this would violate the “principle of least surprise” for people working with your code and would be ill-suited since you have to implement the “Count” property). If you’re using Entity Framework or some other database loading scheme, and you want to leave it up to the code calling yours when the query gets executed, return IEnumerable. In this fashion, when a client calls the method you’re writing, you can return IEnumerable, build them a query (say with Linq), and say “here, you can have this immediately with incredible performance, and it’s up to you when you actually want to execute this thing and start hammering away at the database with retrieval tasks that may take milliseconds or seconds.”
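Here is a rough sketch of deferred execution using plain LINQ to Objects rather than a real database. CustomerRepository and its members are hypothetical names, and an in-memory list stands in for the data store, so treat it as an illustration of the timing and not of any particular ORM.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class CustomerRepository
{
    // In-memory stand-in for a real data store (an assumption for this sketch).
    private readonly List<string> _names = new List<string> { "Ann", "Bob" };

    public void Add(string name) => _names.Add(name);

    // Returns a query, not results: the filter runs only when the
    // caller enumerates (for example with foreach or ToList()).
    public IEnumerable<string> NamesStartingWith(string prefix)
    {
        return _names.Where(n => n.StartsWith(prefix, StringComparison.Ordinal));
    }
}
```

If a client calls NamesStartingWith("A"), then a new name “Al” is added, and only afterwards the client calls ToList() on the query, the result contains both “Ann” and “Al”. The work happened when the client asked for it, not when the method returned.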

Another time that you would clearly want IEnumerable is when you want to tell clients of your method, “hey, this is not a data structure you can modify — you can only peek at what’s there. If you want your own thing to modify, make your own by slapping what we give you in a list.” To be less colloquial, you can return IEnumerable when you want to make it clear to consumers of your method that they cannot modify the original source of information. It’s important to understand that if you’re going to advertise this, you should probably exercise care in how the thing you’re returning will behave. What I mean is, don’t return IEnumerable and then give your clients something where they can modify the internal aggregation of the data (meaning, if you return IEnumerable don’t let them reorder their copy of it and have that action also reorder it in the place you’re storing it).
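A small sketch of the trap described above, with hypothetical names (Playlist, SongsLeaky, Songs). Returning the internal List<T> typed as IEnumerable only hides the list at compile time; a caller can cast it back. One option, among several, is to hand out a read-only wrapper instead.

```csharp
using System;
using System.Collections.Generic;

public class Playlist
{
    private readonly List<string> _songs = new List<string> { "A", "B", "C" };

    public int Count => _songs.Count;

    // Leaky: the object returned is still the internal List<string>,
    // so a caller can cast it back and mutate our storage.
    public IEnumerable<string> SongsLeaky() => _songs;

    // Safer: a read-only wrapper around the list. Casting it to
    // List<string> fails, and its mutating members throw.
    public IEnumerable<string> Songs() => _songs.AsReadOnly();
}
```

With the leaky version, ((List<string>)playlist.SongsLeaky()).Clear() empties the playlist’s own storage. Note that the wrapper returned by AsReadOnly() is a live view rather than a snapshot, so later changes made inside Playlist remain visible through it.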

When You’d Clearly Use IList

By contrast, there are times when IList makes sense, and those are probably easier to understand. If, for instance, your clients want a concrete, tangible, and (generally) modifiable list of items, IList makes sense. If you want to return something with an ordering that matters and give them the ability to change that ordering, then give them a list. If they want to be able to walk the items from front to back and back to front, give them a list. If they want to be able to look up items by their position, give them a list. If they want to be able to add or remove items, give them a list. Any random accesses and you want to provide a list. Clearly, it’s a data structure you can wrap your head around easily — certainly more so than IEnumerable.

Good Polymorphic Practice

With the low hanging fruit out of the way, let’s dive into grayer areas. A rule of thumb that has served me well in OOP is “accept as generic as possible, return as specific as possible.” This is being as cooperative with client code as possible. Imagine if I write a method called “ScareBurglar()” that takes an Animal as argument and invokes the Animal’s “MakeNoise()” method. Now, imagine that instead of taking Animal as the parameter, ScareBurglar took Dog and invoked Dog.MakeNoise(). That works, I suppose, but what if I had a guard-bear? I think the bear could make some pretty scary noises, but I’ve pigeon-holed my clients by being too specific in what I accept. If MakeNoise() is a method on the base class, accept the base class so you can serve as many clients as possible.

On the flip side, it’s good to return very specific types for similar kinds of reasoning. If I have a “GetDog()” method that instantiates and returns a Dog, why pretend that it’s a general Animal? I mean, it’s always going to be a Dog anyway, so why force my clients that are interested in Dog to take an Animal and cast it? I’ve blogged previously about what I think of casting. Be specific. If your clients want it to be an animal, they can just declare the variable to which they’re assigning the return value as Animal.
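The rule of thumb from the last two paragraphs can be sketched as follows. The Bear class, the Fetch method and the HomeSecurity holder class are assumptions added for illustration; only Animal, Dog, MakeNoise, ScareBurglar and GetDog come from the discussion above.

```csharp
using System;

public abstract class Animal
{
    public abstract string MakeNoise();
}

public class Dog : Animal
{
    public override string MakeNoise() => "Woof";

    // Something only a Dog can do (hypothetical).
    public string Fetch() => "stick";
}

public class Bear : Animal
{
    public override string MakeNoise() => "Roar";
}

public static class HomeSecurity
{
    // Accept the most general type that has what we need.
    public static string ScareBurglar(Animal guard) => guard.MakeNoise();

    // Return the most specific type we actually produce.
    public static Dog GetDog() => new Dog();
}
```

ScareBurglar happily takes a Bear, and a client that calls GetDog() can use Fetch() without casting. A client that only cares about animals can still write Animal pet = HomeSecurity.GetDog();.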

So, with this rule of thumb in mind, it would suggest that returning lists is a good idea when you’re definitely going to return a list. If your implementation instantiates a list and returns that list, with no possibility of it being anything else, then you might want to return a list. Well, unless…

Understanding the Significance of Interfaces

A counter-consideration here is “am I programming to an interface or in a simple concrete type.” Why does this matter? Well, it can push back on what I mentioned in the last section. If I’m programming a class called “RandomNumberProvider” with a method “GetMeABunchOfNumbers()” that creates a list, adds a bunch of random numbers to it, and returns that list, then I should probably return List<int>. But what if I’m designing an interface called IProvideNumbers? Now there is no concrete implementation — no knowledge that what I’m returning is going to be implemented as List everywhere. I’m defining an abstraction, so perhaps I want to leave my options open. Sure RandomNumberProvider that implements the interface only uses a list. But how do I know I won’t later want a second implementation called “DeferredExecutionNumberProvider” that only pops numbers as they’re iterated by clients?

As a TDD practitioner, I find myself programming to interfaces. A lot. And so, I often find myself thinking, what are the postconditions and abilities I want to guarantee to clients across the board? This isn’t necessarily, itself, a by-product of TDD, but of programming to interfaces. And, with programming to interfaces, specifics can bite you at times. Interfaces are meant to allow flexibility and future-proofing, so getting really detailed in what you supply can tie your hands. If I promise only an IEnumerable, I can later define implementers that do all sorts of interesting things, but if I promise an IList, a lot of that flexibility (such as deferred execution schemes) goes out the window.
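Here is one way the IProvideNumbers idea might look in code. The count parameter and the range of the random numbers are my own assumptions for the sketch. The point is only that an interface promising IEnumerable leaves room for both an eager, list-backed implementation and a deferred one.

```csharp
using System;
using System.Collections.Generic;

public interface IProvideNumbers
{
    // Promises only a sequence, not a particular data structure.
    IEnumerable<int> GetMeABunchOfNumbers(int count);
}

public class RandomNumberProvider : IProvideNumbers
{
    private readonly Random _random = new Random();

    // Eager: builds the whole list up front and returns it.
    public IEnumerable<int> GetMeABunchOfNumbers(int count)
    {
        var numbers = new List<int>();
        for (int i = 0; i < count; i++)
        {
            numbers.Add(_random.Next(100));
        }
        return numbers;
    }
}

public class DeferredExecutionNumberProvider : IProvideNumbers
{
    private readonly Random _random = new Random();

    // Deferred: each number is generated only as the client iterates.
    public IEnumerable<int> GetMeABunchOfNumbers(int count)
    {
        for (int i = 0; i < count; i++)
        {
            yield return _random.Next(100);
        }
    }
}
```

Had the interface promised IList<int>, the second implementation could not be written as an iterator at all; it would have to materialize everything first.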

The Client’s Burden

An interesting way to evaluate some of these tradeoffs is to contemplate what your client’s pain points might be if we guess wrong. Let’s say we go with IEnumerable as a return type but the client really just wants an IList (or even just List). How bad is the client’s burden? Well, if the client only wants to access the objects, it can just awkwardly append .ToList() to the end of each call to the method and have exactly what it wants. If the client wants to modify the state of the grouping (e.g. put the items in a different order and have you cooperate), it’s pretty hosed and can’t really use your services. However, that latter case is addressed by the section above on when you’d clearly use IList: if your clients want to do that, you should not give them IEnumerable.

What about the flip side? If the client really wants an IEnumerable and you give them a list? Most likely they want IEnumerable for deferred execution purposes, and you will fail at that. There may be other reasons I’m not thinking of off the top, but it seems that erring when the client wants an enumerable is kind of a deal-breaker for your code being useful.

Ugh, so what should I do?!?

Clear as mud? Well, problem is, it’s a complicated subject and I can only offer you my opinion by way of heuristics (unless you want to send me code or gists, and then I can offer concrete opinions and I’m actually happy to do that). At the broadest level, you should ask yourself what your client is going to be doing with the thing that you return and try to accommodate that. At the next broadest level, you should think to yourself, “do I want to provide the client a feature-rich experience at the cost of later flexibility or do I want to provide the client a more sparse set of behavior guarantees so that I can control more implementation details?”

It also pays to think of the things you’re returning in terms of what they should do (or have done to them), rather than what they are. This is the line of thinking that gets you to ask questions like “will clients need to perform random accesses or sorts,” but it lets you go beyond simple heuristics when engaged in design and really get to the heart of things. Think of what needs to be done, and then go looking for the data type that represents the smallest superset of those things (or, write your own, if nothing seems to fit).

I’ll leave off with what I’ve noticed myself doing in my own code. More often than not, when I’m communicating between application layers I tend to use a lot of interfaces and deal a lot in IEnumerable. When I’m implementing code within a layer, particularly the GUI/presentation layer in which ordering is often important, I favor collections and lists. This is especially true if there is no interface seam between the collaborating components. In these scenarios I’m more inclined to follow the “return the most specific thing possible” heuristic rather than the “be flexible in an interface” heuristic.

Another thing that I do is try to minimize the number of collections that I pass around an application. The most common use case for passing around bunches of things is collections of data transfer objects, such as some method like “GetCustomersWithFirstName(string firstName).” Clearly that’s going to return a bunch of things. But in other places, I try to make aggregation an internal implementation detail to a class. Command-Query Separation helps with this. If I can, I don’t ask you for a collection, do things to it and hand it back. Instead I say “do this to your collection.”
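As a rough illustration of “do this to your collection,” here is a sketch with a hypothetical Roster class. The list stays private, the class exposes commands that act on it, and a read-only query lets clients look at the result.

```csharp
using System;
using System.Collections.Generic;

public class Roster
{
    // The aggregation is an internal implementation detail.
    private readonly List<string> _players = new List<string>();

    // Commands: clients tell the roster what to do to its own collection.
    public void Add(string name) => _players.Add(name);

    public void SortByName() => _players.Sort(StringComparer.Ordinal);

    // Query: clients can look, but not rearrange our storage.
    public IEnumerable<string> Players => _players.AsReadOnly();
}
```

A client that wants the players in name order calls roster.SortByName() instead of fetching the list, sorting it, and handing it back.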

And finally, when in doubt and all else seems to be a toss-up, I tend to favor promising the least (thus favoring future flexibility). So if I really can’t make a compelling case one way or the other for any reason, I’ll just say “you’re getting an IEnumerable because that makes maintenance programming likely to be less painful later.”

Arne Claassen

Unless you want your client to modify your internal list, I would pretty much never advocate returning an IList. An IList on an interface implies foo.SomeList.Add(x) works. If you are not intending them to mutate your state via a reference to your list, but they simply want to mutate their own copy, then it should be up to them to convert the IEnumerable into a mutable version in the local context.

I do think that it is unfortunate that MS didn’t distinguish between lazy and materialized enumerables, since now you never know whether it is safe to do things like .Count() or .ElementAt() multiple times on the IEnumerable, but that’s no good reason to have a mutable accessor to what should be immutable.

dave falkner

I wondered if your point might be too obvious of a bad practice for the Internet at large to benefit from hearing, but I do know of a team with some people who could benefit from one more chance to learn that lesson.

If the class that needs to expose the collection of things also needs to encapsulate any of the add/remove/clear/sort operations on the list, then a List/IList should be out of the question.

I have a friend/coworker who very comically compares this to removing your stomach and telling someone else, “I’m hungry. Would you please refill this for me?”

Kam

No love for arrays? I tend to use them as lightweight fixed length return types when a count/length is desirable. I guess ToArray() adds overhead but my sets are usually pretty short.

Arne Claassen

I used to favor arrays over IEnumerable, but similar to IList, it allows for foo.SomeArray[i] = y, which is either undesirable or incorrect. IEnumerable very nicely says “you get a readonly sequence, and if you want to index it or mutate it, it’s your business”

http://www.daedtech.com/blog Erik Dietrich

I don’t find myself using arrays a whole lot, and I think it’s because I’m rarely writing code these days where I say to myself, “here’s something I know will never change in size” or where I say to myself “the performance gain on this matters to me.” But, I didn’t address that data type here because the question directed at me was about IEnumerable and IList.

http://www.daedtech.com/blog Erik Dietrich

It’d be interesting to see code bases where that distinction was made by convention. I bet that would actually be pretty helpful and indicative of sophistication (off the top, you could have a scheme where you inherited the interface and did ILazyEnumerable and IMaterializedEnumerable).

But yeah, coughing up internal guts bit me at times when I was a junior developer, and it was a big reason I gravitated away from returning modifiable collections that I needed (or exposing them as properties). It’s hard to make this point to junior developers at times, if they haven’t been burned by how terribly this scales (lots of classes all working with the same single collection).

Arne Claassen

We’ve come to the convention at MindTouch that any IEnumerable returned is safe to be assumed as either materialized or fast, so all the linq operations from .Any(), .Count() to .ElementAt() are valid without concern.

For the few cases where we actually want to return a lazy sequence, it’s always via a method with an Enumerable postfix.

I had considered creating ILazyEnumerable and IMaterializedEnumerable, but since we can’t add interfaces to existing types like arrays and lists, that would have required a conversion to a class implementing those for every use, which we felt was more onerous a burden than using a naming convention.

http://www.daedtech.com/blog Erik Dietrich

I think the convention approach is a sound one (and clearly more thought out than my off the cuff suggestion). It brings to mind an interesting thought exercise for me, which is that I wonder if you could enforce that convention in a .NET code base with Code Contracts or some other design by contract scheme. That might not be trivial to do, but it would be an interesting way to prevent faux pas in the code by new developers.

http://blog.dox.com.au/ Ian Yates

What about IReadOnlyList? That would address some of the “but this implies .Add() or x[] = blah” concerns but still indicate to the client that .Count and indexing work and that the result is computed up front.

http://www.daedtech.com/blog Erik Dietrich

That’s not a type that’s really been on my radar, truth be told. I think you raise a good point, and thanks for the suggestion. It would fit cases where I want random access/ordering but not mutability of the list.

When I think about why I haven’t gone searching for something like this, I think it’s probably because I don’t find myself passing collections around my application in which the ordering is significant. It doesn’t seem to come up a lot for me (but if it’s fresher in my mind, maybe I’ll find myself using it now).