Welcome to my blog! I'm a Sr. Software Development Engineer in the Seattle area who has been doing C++/C#/Java development for over 20 years, but I have definitely learned that there is always more to learn!

All thoughts and opinions expressed in my blog and my comments are my own and do not represent the thoughts of my employer.

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past Little Wonders posts can be found here.

In this post I will finish examining the System.Linq methods in the static class Enumerable by looking at two extension methods, Count() and DefaultIfEmpty(), and one static method, Empty().

The Empty() static method

How many times have you had to return an empty collection from a method (either due to an error condition, or no items exist, etc.) and created an empty array or list?

Let’s look at a simple POCO to hold transfers between two bank accounts:

    public class Transfer
    {
        public string FromAccount { get; set; }
        public string ToAccount { get; set; }
        public double Amount { get; set; }
    }

Now let’s say we have a data access object that is supposed to grab the outstanding transfers and return them. However, if the transfer service is down for maintenance, we just want to return an empty sequence of transfers.

We could of course return null, but returning an empty sequence is generally preferable to returning null, since callers can iterate over the result without a null check. So we’ll return an empty sequence in the form of a list (an array would do as well):
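A minimal sketch of what such a method might look like, assuming a hypothetical TransferDao class with an isServiceDown flag and a GetOutstandingTransfers() method (none of these names come from the original listing):

```csharp
using System.Collections.Generic;

public class Transfer
{
    public string FromAccount { get; set; }
    public string ToAccount { get; set; }
    public double Amount { get; set; }
}

// Hypothetical data access object; the names are assumptions for illustration.
public class TransferDao
{
    private readonly bool _isServiceDown;

    public TransferDao(bool isServiceDown)
    {
        _isServiceDown = isServiceDown;
    }

    public IEnumerable<Transfer> GetOutstandingTransfers()
    {
        if (_isServiceDown)
        {
            // Allocates a brand-new empty list on every call.
            return new List<Transfer>();
        }

        // Stand-in for a real call to the transfer service.
        return new List<Transfer>
        {
            new Transfer { FromAccount = "A", ToAccount = "B", Amount = 25.0 }
        };
    }
}
```

Callers can simply foreach over the result in either case, with no null check needed.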

The problem with this is that we allocate a new collection on every call for something that will never change. If all we intend to do is return a read-only empty sequence of a given type, we can use LINQ’s Enumerable.Empty<T>() method, which hands back a cached empty sequence instead of allocating a new one each time:
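As a sketch only, here is how the empty-sequence case might look with Enumerable.Empty<T>(). The TransferDao class, its isServiceDown flag, and GetOutstandingTransfers() are hypothetical names used for illustration, not the original listing:

```csharp
using System.Collections.Generic;
using System.Linq;

public class Transfer
{
    public string FromAccount { get; set; }
    public string ToAccount { get; set; }
    public double Amount { get; set; }
}

// Hypothetical data access object; the names are assumptions for illustration.
public class TransferDao
{
    private readonly bool _isServiceDown;

    public TransferDao(bool isServiceDown)
    {
        _isServiceDown = isServiceDown;
    }

    public IEnumerable<Transfer> GetOutstandingTransfers()
    {
        if (_isServiceDown)
        {
            // Reuses the cached empty sequence for Transfer instead of
            // creating a new collection per call.
            return Enumerable.Empty<Transfer>();
        }

        // Stand-in for a real call to the transfer service.
        return new List<Transfer>
        {
            new Transfer { FromAccount = "A", ToAccount = "B", Amount = 25.0 }
        };
    }
}
```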

Note that we’re calling the Empty<T>() static method off of the Enumerable class. This is the same class where all the extension methods for IEnumerable<T> are defined, but Empty<T>() is just a simple static method (not an extension method) that returns a singleton empty sequence for type T.
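One way to see the singleton behavior for yourself is to compare the references returned by two calls. This is a small sketch; the EmptyDemo class and SameInstance<T>() helper are hypothetical names, and the result reflects the caching behavior described above rather than anything you should build logic on:

```csharp
using System.Linq;

public static class EmptyDemo
{
    // Returns true when two calls to Enumerable.Empty<T>() hand back
    // the very same object for the given element type.
    public static bool SameInstance<T>()
    {
        return object.ReferenceEquals(Enumerable.Empty<T>(), Enumerable.Empty<T>());
    }
}
```

Calling EmptyDemo.SameInstance<int>() or EmptyDemo.SameInstance<string>() should return true, whereas comparing two separately created new List<int>() instances would of course return false.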

The DefaultIfEmpty() extension method

So we’ve seen that the static Empty<T>() method can be used to generate singleton, read-only empty sequences of type T. But what happens if you want to return a sequence containing a single default item if a sequence is empty?

Why would you ever want to do this, you say? Well, for example, what if you’re analyzing a list of test scores, but want to return a single score of zero if the student has no scores so far:

    var scores = new[] { 73, 77, 89, 90, 92, 77 };

    // If scores is non-empty, returns that sequence; if scores is empty, returns
    // a sequence with a single item, which is default(int) in this case (0).
    foreach (var score in scores.DefaultIfEmpty())
    {
        Console.WriteLine("The score is: " + score);
    }

Now, there is also a second form of DefaultIfEmpty() that lets you specify the default to use if the sequence is empty instead of relying on default(T) where T is the type held by the IEnumerable<T> sequence. For example, what if we want an average of all the scores, but want to return an average of 100 if no scores have been entered yet?

    // Averages the sequence if it's not empty; if it is empty, returns 100
    // (more precisely, the average of a sequence containing just 100).
    var averageSoFar = scores.DefaultIfEmpty(100).Average();

Note that if you don’t specify the default value, it uses the default(T), which is the default value for type T – null for reference types, zero for numeric, etc.
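To make both forms concrete, here is a small sketch that exercises them on empty sequences. The DefaultIfEmptyDemo class and its method names are hypothetical; the last helper shows why the fallback is handy, since Average() on an empty sequence of int throws an InvalidOperationException:

```csharp
using System;
using System.Linq;

public static class DefaultIfEmptyDemo
{
    // Average of the scores, or of a one-item sequence holding the
    // fallback value when there are no scores yet.
    public static double AverageOr(int[] scores, int fallback)
    {
        return scores.DefaultIfEmpty(fallback).Average();
    }

    // The single element DefaultIfEmpty() yields for an empty sequence of
    // strings: default(string), which is null.
    public static string DefaultForEmptyStrings()
    {
        var empty = new string[0];
        return empty.DefaultIfEmpty().Single();
    }

    // Without DefaultIfEmpty(), averaging an empty int sequence throws.
    public static bool AverageThrowsOnEmpty()
    {
        var empty = new int[0];
        try
        {
            empty.Average();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

With the scores used earlier, AverageOr(scores, 100) gives 83, while AverageOr(new int[0], 100) gives 100.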

The Count() extension method

This may seem like a no-brainer, right? The Count() method returns the number of the items in the sequence, correct? Yes, for the most part, but it also has some nice features worth mentioning.

First of all, calling Count() with no arguments (aside from the implicit source in the extension method syntax) returns the count of the sequence as you’d expect. This may seem somewhat redundant for types like List<T>, which already maintains a Count property, or arrays, which have a Length property:

    var scoreArray = new[] { 73, 77, 89, 90, 92, 77 };
    var scoreList = new List<int> { 73, 77, 89, 90, 92, 77 };

    // get count of array using Length property and the Count() extension method:
    Console.WriteLine(scoreArray.Length);
    Console.WriteLine(scoreArray.Count());

    // get count of list using Count property and the Count() extension method:
    Console.WriteLine(scoreList.Count);
    Console.WriteLine(scoreList.Count());

Note that (if we are using the System.Linq namespace) the list has both the Count property defined in the List<T> class, and the Count() extension method defined on IEnumerable<T>. This may seem redundant, but keep in mind that not all IEnumerable<T> implementations have a count defined, so the only way to find out the count of those sequences is to count them. In addition, this ensures that for any IEnumerable<T> there is a consistent method that can be called to get the count of the sequence.

Now, you may think that this is a big red flag, and that for List<T> and arrays the Count() extension method would be an O(n) (linear) operation if it had to count all the items in the list, whereas Count and Length are O(1) (constant-time) operations. Fear not, however: the Count() extension method first checks to see if the type implementing IEnumerable<T> also implements ICollection<T> or ICollection (non-generic).

Why is this important? Because any collection that implements ICollection or ICollection<T> must specify a Count property. Since List<T>, arrays, and many other collections implement ICollection<T>, this means the Count() extension method can cast them and call Count directly and avoid having to count the items.

Thus, calling the Count() extension method on a list is slightly less efficient than calling the Count property because it has to check the type and cast, but it’s still a constant-time, O(1), operation. And the nice thing is that you can call the Count() extension method on any IEnumerable<T> sequence for consistency and know that it will make the right choice for efficiency.
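The type check described above can be sketched roughly as follows. This is a hand-written illustration under a hypothetical name, FastCount(), and not the framework’s actual source:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CountSketch
{
    // Hypothetical stand-in for Enumerable.Count(), for illustration only.
    public static int FastCount<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Fast path: generic collections already know their size.
        if (source is ICollection<T> genericCollection)
        {
            return genericCollection.Count;
        }

        // Fast path: so do non-generic collections.
        if (source is ICollection collection)
        {
            return collection.Count;
        }

        // Slow path: no Count available, so walk the whole sequence.
        int count = 0;
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                checked { count++; }
            }
        }
        return count;
    }
}
```

Lists and arrays take the first branch, while a lazily evaluated query such as the result of Where() falls through to the loop.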

So, simple Count(), not very exciting eh? Well, Count() has an overload which takes a predicate (Func<T, bool>) which allows you to count all items in the sequence that meet the predicate condition:

    var scoreList = new List<int> { 73, 77, 89, 90, 92, 77 };

    var numberOfBsOrBetter = scoreList.Count(s => s >= 80);

Once again this works for any sequence of IEnumerable<T> and is a very handy shortcut for:

    // why type this when there's a handy Count() overload?
    var numberOfBsOrBetter = scoreList.Where(s => s >= 80).Count();

So in addition to counting all items, this form of count is very handy for counting only the items in a collection that meet any given condition.

A side note on Count() versus Any()

The Count() extension method is very useful for checking for the number of items in a sequence, but if all you really want to know is if a sequence is non-empty, it’s probably more efficient to use the Any() extension method instead, because the Any() method halts as soon as it finds the first item that meets the criteria, instead of counting all items that meet the criteria:

    // this counts all the items even if the first item is already a match
    var hasAnyBsOrBetter = scoreList.Count(s => s >= 80) != 0;

    // do this instead, it will halt on the first match and not check further
    var hasAnyBsOrBetter2 = scoreList.Any(s => s >= 80);

Note that Any(), like Count(), can be called with no parameters (besides the implicit extension method source parameter), which tells you if the sequence contains any item at all (that is, if it’s not empty):

    // this counts all the items in the sequence even if we only care about non-empty:
    var isEmpty = someSequence.Count() == 0;

    // this only checks to make sure there is at least one item and stops
    var isEmpty2 = !someSequence.Any();

If the sequence is something like a list or array it may not matter because they have a constant time – O(1) – count implementation, but for other sequences, it’s much more efficient to check Any() for non-empty, since Any() will halt after the first find.

In addition there is no performance short-cut on Count() with the predicate, so if you only care that there’s at least one match and not how many, Any() is again the better choice.
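To see the difference in how much work each call does, here is a small sketch that tracks how many items a lazy sequence produces. The AnyVersusCountDemo class, the ItemsProduced counter, and the helper methods are hypothetical names introduced for this illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCountDemo
{
    // Number of items the lazy sequence has produced since the last reset.
    public static int ItemsProduced;

    // A lazy sequence with no Count property; each produced item bumps the counter.
    public static IEnumerable<int> LazyScores()
    {
        foreach (var score in new[] { 73, 77, 89, 90, 92, 77 })
        {
            ItemsProduced++;
            yield return score;
        }
    }

    // Items produced while checking for at least one score of 80 or better.
    public static int ItemsProducedByAny()
    {
        ItemsProduced = 0;
        LazyScores().Any(s => s >= 80);
        return ItemsProduced;
    }

    // Items produced while counting every score of 80 or better.
    public static int ItemsProducedByCount()
    {
        ItemsProduced = 0;
        LazyScores().Count(s => s >= 80);
        return ItemsProduced;
    }

    // Items produced while checking only for non-emptiness.
    public static int ItemsProducedByParameterlessAny()
    {
        ItemsProduced = 0;
        LazyScores().Any();
        return ItemsProduced;
    }
}
```

With these scores, Any() with the predicate stops at 89 after producing three items, the parameterless Any() stops after one, and Count() with the predicate walks all six.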

Summary

These are the last of the Enumerable methods I’ve been working through as part of my Little Wonders series. The Empty() static method is useful for using a singleton to represent an empty sequence of a type instead of creating empty collections which need to be garbage collected. The DefaultIfEmpty() extension method is useful if you want to have a single default value stand in for an empty sequence. Finally, the Count() extension method is handy for counting either all the items in a sequence, or only the items that match a given predicate.

Hope you’ve enjoyed the Little Wonders of the Enumerable static class. There are plenty more wonders to be seen in other areas, so stay tuned!