Reimplementing LINQ to Objects: Part 10 – Any and All

Another day, another blog post. I should emphasize that this rate of posting is likely to be short-lived… although if I get into the habit of writing a post on the morning commute when I go back to work after the Christmas holidays, I could keep ploughing through until we’re done.

"Any" without a predicate returns whether there are any elements in the input sequence

"Any" with a predicate returns whether any elements in the input sequence match the predicate

"All" returns whether all the elements in the input sequence match the given predicate

Both operators use immediate execution – they don’t return until they’ve got the answer, basically.
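As a quick illustration of the three descriptions above, here is a small sketch that calls the framework's own System.Linq operators (not the reimplementation from this series). The class name, method names, and the "is even" predicate are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, named only for this illustration.
public static class AnyAllBasics
{
    // Any without a predicate: true if the sequence has at least one element.
    public static bool HasElements(IEnumerable<int> source)
    {
        return source.Any();
    }

    // Any with a predicate: true if at least one element matches.
    public static bool HasEven(IEnumerable<int> source)
    {
        return source.Any(x => x % 2 == 0);
    }

    // All: true only if every element matches (vacuously true when empty).
    public static bool AllEven(IEnumerable<int> source)
    {
        return source.All(x => x % 2 == 0);
    }
}
```

Each call returns a plain bool straight away, which is what immediate execution means here: nothing is deferred until a result is enumerated later.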

Importantly, "All" has to read through the entire input sequence to return true, but can return as soon as it’s found a non-matching element; "Any" can return true as soon as it’s found a matching element, but has to iterate over the entire input sequence in order to return false. This gives rise to one very simple LINQ performance tip: it’s almost never a good idea to use a query like

    // Don’t use this
    if (query.Count() != 0)

That has to iterate over *all* the results in the query… when you really only care whether or not there are any results. So use "Any" instead:

    // Use this instead
    if (query.Any())

If this is part of a bigger LINQ to SQL query, it may not make a difference – but in LINQ to Objects it can certainly be a huge boon.
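To make the difference concrete, the sketch below counts how many elements each approach pulls from a lazily generated sequence. It uses the framework's System.Linq operators. The class, the `Numbers` source, and the `Pulled` counter are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class: counts how many elements a query is asked for.
public static class CountVersusAny
{
    // Number of elements the source has produced since the last reset.
    public static int Pulled;

    // Lazily yields 'size' integers, bumping the counter for each one.
    public static IEnumerable<int> Numbers(int size)
    {
        for (int i = 0; i < size; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Returns how many elements were produced to answer "Count() != 0".
    public static int PulledByCount(int size, out bool nonEmpty)
    {
        Pulled = 0;
        nonEmpty = Numbers(size).Count() != 0; // walks the whole sequence
        return Pulled;
    }

    // Returns how many elements were produced to answer "Any()".
    public static int PulledByAny(int size, out bool nonEmpty)
    {
        Pulled = 0;
        nonEmpty = Numbers(size).Any(); // stops after the first MoveNext
        return Pulled;
    }
}
```

With a source of 1000 elements, the Count version should produce all 1000 of them, while the Any version should produce just one. Both give the same yes/no answer.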

Anyway, let’s get on to testing the three methods…

What are we going to test?

Feeling virtuous tonight, I’ve even tested argument validation again… although it’s easy to get that right here, as we’re using immediate execution.

Beyond that, I’ve tested a few scenarios:

An empty sequence will return false with Any, but true with All. (Whatever the predicate is for All, there are no elements which fail it.)

A sequence with any elements at all will make the predicate-less Any return true.

If none of the elements match the predicate, both Any and All return false.

If some elements match the predicate, Any will return true but All will return false.

If all elements match the predicate, All will return true.

Those are all straightforward, so I won’t give the code. One final test is interesting though: we prove that Any returns as soon as it’s got the result by giving it a query which will throw an exception if it’s iterated over completely. The easiest way of doing this is to start out with a sequence of integers including 0, and then use Select with a projection which divides some constant value by each element. In this test case, I’ve given it a value which will match the predicate before the value which will cause the exception to be thrown.
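A test along those lines could be sketched as follows. It uses the framework's System.Linq operators as a stand-in for the implementation under test, and plain methods rather than a unit-test framework. The class name, method names, and the concrete values (10, 2, 0, 3 and the constant 10) are assumptions chosen for the example.

```csharp
using System;
using System.Linq;

// Hypothetical check class: not tied to any particular test framework.
public static class EarlyOutCheck
{
    // Returns Any's result for a query that would throw if fully iterated.
    public static bool AnyStopsAtFirstMatch()
    {
        int[] source = { 10, 2, 0, 3 };
        // Lazy projection: 10 / 0 would throw DivideByZeroException if reached.
        var query = source.Select(x => 10 / x);
        // 10 / 10 = 1 (no match), 10 / 2 = 5 (match), so the zero is never seen.
        return query.Any(y => y > 2);
    }

    // Control case: with no match before the zero, iteration reaches it.
    public static bool FullIterationThrows()
    {
        int[] source = { 10, 2, 0, 3 };
        var query = source.Select(x => 10 / x);
        try
        {
            query.Any(y => y < 0); // nothing matches, so Any keeps going
            return false;
        }
        catch (DivideByZeroException)
        {
            return true;
        }
    }
}
```

The control case is there to show that the query really is a trap: if the first method returned true only by luck, the second would not throw.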

It’s simplest to implement the predicate-less Any in terms of the predicated one – using a predicate which returns true for any element means that Any will return true for any element at all, which is what we want.

The inversions in the call to All take a minute to get your head round, but it’s basically De Morgan’s law in LINQ form: we effectively invert the predicate to find out if all of the elements don’t match the original predicate… then return the inverse. Due to the inversion, this still returns early in all the appropriate situations, too.
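The inversion described above can be sketched like this. The methods are plain static methods rather than extension methods so that they don't clash with System.Linq, and the class name is invented for the example. This is one possible shape, not necessarily the exact code the text refers to.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class showing Any built on top of All.
public static class DeMorganSketch
{
    // A direct All: false as soon as one element fails the predicate.
    public static bool All<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));
        foreach (T item in source)
        {
            if (!predicate(item))
            {
                return false;
            }
        }
        return true;
    }

    // "Some element matches" is the same as "not every element fails to match".
    public static bool Any<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));
        return !All(source, x => !predicate(x));
    }

    // Predicate-less Any: every element trivially matches "x => true".
    public static bool Any<T>(IEnumerable<T> source)
    {
        return Any(source, x => true);
    }
}
```

The early exit carries over: All stops at the first element for which the inverted predicate is false, which is exactly the first element that matches the original predicate.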

While we could do that, I’ve actually preferred a straightforward implementation of all of the separate methods.
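A sketch of what such direct implementations might look like is shown here. Again, these are plain static methods rather than extension methods, to avoid clashing with System.Linq, and the class name is made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class with each operator written out on its own.
public static class DirectSketch
{
    public static bool Any<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        using (IEnumerator<T> iterator = source.GetEnumerator())
        {
            // Only MoveNext is called; the element's value is never read.
            return iterator.MoveNext();
        }
    }

    public static bool Any<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));
        foreach (T item in source)
        {
            if (predicate(item))
            {
                return true; // early out: one match is enough
            }
        }
        return false;
    }

    public static bool All<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));
        foreach (T item in source)
        {
            if (!predicate(item))
            {
                return false; // early out: one failure is enough
            }
        }
        return true;
    }
}
```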

Aside from anything else, this makes it obvious where the "early out" comes in each case – and also means that any stack traces generated are rather easier to understand. It would be quite odd from a client developer’s point of view to call Any but see All in the stack trace, or vice versa.

One interesting point to note is that I don’t actually use a foreach loop in the predicate-less Any – although I could, of course. Instead, I just get the iterator and then return whether the very first call to MoveNext indicates that there are any elements. I like the fact that, reading this method, it’s obvious (at least to me) that we really couldn’t care less what the value of the first element is – because we never ask for it.

Conclusion

Probably the most important lesson here is the advice to use Any (without a predicate) instead of Count when you can. The rest was pretty simple – although it’s always fun to see one operator implemented in terms of another.

So, what next? Possibly Single/SingleOrDefault/First/FirstOrDefault/Last/LastOrDefault. I might as well do them all together – partly as they’re so similar, and partly to emphasize the differences which do exist.

In Any (without predicate) you only call MoveNext. This is fine, but I guess the .Current property of the iterator could throw an exception if evaluated. I guess what I’m saying is if the standard .NET version of Any does look at the .Current property (perhaps implicitly by doing a foreach (var item in source) { return true; }) and yours doesn’t, then they behave differently in this admittedly rather contrived edge case ;)

“It’s simplest to implement the predicate-less Any in terms of the predicated one – using a predicate which returns true for any element means that Any will return true for any element at all, which is what we want.”

In LINQ to Objects, Any() and Any(x => true) do the same thing as long as the underlying Enumerator.Current doesn’t have side-effects or raise exceptions. This is because the predicate-less Any never actually calls Enumerator.Current. (Just like yours!)

You can’t actually implement one in terms of the other if you want to conform.
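The edge case discussed in these comments can be probed with a deliberately contrived enumerator whose Current always throws. The sketch below runs the framework's System.Linq operators against it. All type and member names are hypothetical, and the behaviour shown is what the framework implementations I'm aware of do, so treat it as an observation rather than a documented guarantee.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical one-element sequence whose Current throws when read.
public sealed class ThrowingCurrentSequence : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator() { return new Enumerator(); }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    private sealed class Enumerator : IEnumerator<int>
    {
        private bool moved;

        public int Current
        {
            get { throw new InvalidOperationException("Current was read"); }
        }

        object IEnumerator.Current
        {
            get { return Current; }
        }

        // Reports exactly one element, then the end of the sequence.
        public bool MoveNext()
        {
            if (moved) return false;
            moved = true;
            return true;
        }

        public void Reset() { moved = false; }

        public void Dispose() { }
    }
}

public static class CurrentProbe
{
    // Predicate-less Any only needs MoveNext, so no exception is expected.
    public static bool PredicateLessAny()
    {
        return new ThrowingCurrentSequence().Any();
    }

    // Any(x => true) has to read Current to call the predicate.
    public static bool AlwaysTrueAnyReadsCurrent()
    {
        try
        {
            new ThrowingCurrentSequence().Any(x => true);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

If the first method returns true and the second reports that Current was read, the two overloads are observably different for such a sequence, which is the point being made about conformance.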

Yes, I checked, and the .NET implementation doesn’t do it either. The only reason I could think of is that maybe some ICollection implementations have an expensive Count function, and they wanted to play it safe. However, I don’t think I’ve ever run across any real code which has an expensive ICollection.Count implementation.