.NET 3.0 has now been released, so we should all know it by now, shouldn't we? Jeez, it doesn't seem that long ago that .NET 2.0 came along. Well, for those who don't realize, .NET 3.0 actually contains quite a lot of new stuff, such as:

Windows CardSpace: provides a standards-based solution for working with and managing diverse digital identities

So as you can see, there is a lot to be learned right there. I'm in the process of learning WPF/WCF, but I am also interested in a little gem called LINQ, which I believe will be part of .NET 3.5 and Visual Studio "Orcas" (as it's known now). LINQ will add new features to both C# and VB.NET. LINQ has three flavours:

LINQ is pretty cool, and I have been looking into it as of late, so I thought I would write an article about what I have learned in the LINQ/DLINQ/XLINQ areas, in the hopes that it may just help some of you good folk. This article will be focussed on LINQ, and is the first in a series of 3 proposed articles.

The proposed article series content will be as follows:

Part 1 (this article): will be all about standard LINQ, which is used to query in-memory data objects such as Lists, arrays, etc.

See how similar this is. It's very powerful. So that's basically what LINQ allows us to do. And as one can imagine, DLINQ does similar stuff but with database objects, and XLINQ does queries/creation over XML documents.

LINQ also introduces a lot of concepts that have really come from functional programming languages, such as Haskell and LISP. Some of these new concepts are:

Lambdas (which kind of allow anonymous functions (methods in .NET) to be called over a sequence, a nice source on this is here)

To run the code supplied with this article you will need to install the May 2006 LINQ CTP, which is available here. There is a newer March 2007 CTP available, but it's about 4GB for the full install (as it's not just LINQ but the entire next generation of Visual Studio, codenamed "Orcas"), quite fiddly, and probably going to change anyway. So the May 2006 LINQ CTP will be OK for the purposes of what this article is trying to demonstrate.

There are a number of interesting sources for LINQ and functional programming concepts. There is obviously the LINQ site and also some nice web examples, and also some other articles right here at Code Project. I'll list a few for those of you that are curious enough, and want more to look at:

I've now told you where to download LINQ, and pointed you at some other LINQ resources and further readings (which I urge you to do) so by now you are probably thinking "what's left to discuss?" Well the honest answer is that this article's content could probably all be found quite easily using the other resources shown above. But you never know, this article just might put a new spin on things, and help you to understand LINQ in a different way, as each person has a different writing style, so too, does each person have a different learning style. Some folk just may like this article. And to be honest I quite enjoy writing articles, so I'll continue in the hope that someone will like this article's contents.

I do, however, want people to know (just so people know that I am not selling myself as a purveyor of new knowledge) that all the information in this article is neither novel nor really original; it can all be found easily using the web or by trawling the LINQ documentation. But sometimes it's nice to let someone else go through the learning for you and to learn from what they learned. See it as my journey into learning LINQ, which I am sharing with you here.

Although, to be honest, it's not really standard LINQ that excites me; it's DLINQ/XLINQ, for which there is not so much freely available information. So that really is a case of trawling the documentation. But fear not, that is what I will be doing for you good folk in the next two articles. So stay tuned for those future articles. It just would not have made sense to write about those two without some sort of words about standard LINQ.

Before I delve into the nitty-gritty of LINQ, I would just like to mention a bit about the provided demo application. It looks like the figure shown below.

As you can see it comprises a left panel and a right area. On the left the user is able to view a PropertyGrid and a Numeric Up / Down control for each of the source Lists.

The user is able to use the Numeric Up / Down to step through the individual elements of the query source List, and the PropertyGrid will always show the list item currently requested by the user. The query source List may not always be the same List; it will depend on the type of query being performed. However, the PropertyGrid will always allow the user to examine the current query source List in the manner just described.

The main data query sources used for most queries will simply be based on List objects, which contain some really simple class objects. Let's have a look at some example data List objects.

So it can be seen that the 1st List simply contains 10 Item objects, and the 2nd List simply contains 10 Order objects. But what do these Item and Order objects look like? As I previously said, they are very simple objects that are really dumb, and simply there to showcase the talents of what can be done with LINQ.

The right hand area of the demo application shows the current query (actual LINQ syntax) and the results obtained.

This is pretty much how all the source data for the demo application is done, there may be some exceptions, where simple arrays of values are used instead of List objects, but I'll mention those when we come to them.

Think of the demo app as a mini LINQ playground.

So that's about all I think you'll need to know about the demo app, for the moment, so shall we continue?

As I previously stated, standard LINQ (not DLINQ / XLINQ) operates on in-memory data structures such as arrays, collections, etc. LINQ actually does this using methods known as Standard Query Operators.

In the LINQ CTP, the new System.Query.Sequence static class declares a set of methods which expose these Standard Query Operators.

The majority of the Standard Query Operators are extension methods that extend IEnumerable<T>.
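To make the "extension method" idea a bit more concrete, here is a minimal sketch. It is an assumption on my part that you are reading this with a released version of .NET, where the operators live in System.Linq.Enumerable rather than the CTP's System.Query.Sequence, and I have used modern C# top-level statements to keep it short. The numbers array is made up purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample data (not from the demo project).
int[] numbers = { 5, 4, 1, 3, 9 };

// Called as an extension method on the sequence...
List<int> viaExtension = numbers.Where(n => n > 3).ToList();

// ...or as an ordinary static method. Both calls are equivalent,
// because an extension method is just a static method with nicer call syntax.
List<int> viaStatic = Enumerable.ToList(Enumerable.Where(numbers, n => n > 3));

Console.WriteLine(string.Join(", ", viaExtension)); // 5, 4, 9
Console.WriteLine(string.Join(", ", viaStatic));    // 5, 4, 9
```

Because the operators extend IEnumerable<T>, the same calls work on arrays, List<T> and anything else that can be enumerated.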

I think the best way to tackle this subject is to introduce the LINQ Standard Query Operators, and give you a formal definition and an example of each one.

The LINQ specification details the following operators:

Restriction operators

Projection operators

Partitioning operators

Join operators

Concatenation operator

Ordering operators

Grouping operators

Set operators

Conversion operators

Equality operator

Element operators

Generation operators

Quantifiers

Aggregate operators

The Standard Query Operators operate on sequences. Any object that implements the interface IEnumerable<T> for some type T is considered a sequence of that type.

So you can see it's quite some beast. So what I'll attempt to do is give one formal definition and one example for each of the operators. I'll leave further reading (as I am showing one example, there are many possibilities for each operator) as an exercise for the reader.

Predicates

What's a predicate, you say?

Well Wikipedia says:

"In formal semantics a predicate is an expression of the semantic type of sets. An equivalent formulation is that they are thought of as indicator functions of sets, i.e. functions from an entity to a truth value.

In predicate logic, a predicate can take the role as either a property or a relation between entities."

And the LINQ Standard Query Operators documentation says:

"The example below declares a local variable predicate of a delegate type that takes a Customer and returns bool. The local variable is assigned an anonymous method that returns true if the given customer is located in London. The delegate referenced by predicate is subsequently used to find all the customers in London."

That's how LINQ uses predicates. Basically, the easiest way of thinking about predicates is to think of them as filters that evaluate to true or false for each element, so that the IEnumerable<T> data source the expression is applied to ends up containing only the elements that match the filter (predicate).
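The documentation quote above describes an example without showing it here, so this is a minimal sketch of the idea rather than the documentation's own code. The customer names and cities are made up, I have used a tuple in place of a Customer class to keep the sketch self-contained, and it assumes the released System.Linq namespace with top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical customers; a (Name, City) tuple stands in for a Customer class.
(string Name, string City)[] customers =
{
    ("Alice", "London"),
    ("Bob", "Paris"),
    ("Carol", "London"),
};

// A local variable of a delegate type that takes a customer and returns bool.
// It is assigned an anonymous method that returns true for customers in London.
Func<(string Name, string City), bool> predicate =
    delegate ((string Name, string City) c) { return c.City == "London"; };

// The delegate is then used as the filter.
List<string> londoners = customers.Where(predicate).Select(c => c.Name).ToList();

Console.WriteLine(string.Join(", ", londoners)); // Alice, Carol
```

The same filter could be written inline as the lambda c => c.City == "London"; the named delegate just makes it obvious that a predicate is an ordinary function returning bool.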

But we digress. We needed to do this once, so that Predicates could be explained. But now that you've all got the hang of that we'll revisit the 1st Standard Query Operator.

The Restriction Query operator can take either of the forms shown above, where the first argument of the predicate function represents the element to test. The second argument, if present, represents the zero-based index of the element within the source sequence.

So let's see a real world example (using the attached demo project and the _itemList local data source)

var iSmall =
    from it in _itemList
    where it.UnitPrice < 50.00M
    select it;
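For anyone who wants to see the two forms of the restriction operator side by side (element only, and element plus zero-based index), here is a small sketch in method syntax. The items are invented stand-ins for the demo project's Item objects, and it assumes the released System.Linq namespace and top-level statements.

```csharp
using System;
using System.Linq;

// Made-up items standing in for the demo project's Item class.
var items = new[]
{
    new { ItemName = "Pen", UnitPrice = 2.50m },
    new { ItemName = "Guitar", UnitPrice = 120.00m },
    new { ItemName = "Novel", UnitPrice = 12.99m },
    new { ItemName = "Television", UnitPrice = 400.00m },
};

// Form 1: the predicate sees only the element.
var cheap = items.Where(it => it.UnitPrice < 50.00m).ToList();

// Form 2: the predicate also receives the zero-based index of the element.
var cheapInFirstTwo = items
    .Where((it, index) => it.UnitPrice < 50.00m && index < 2)
    .ToList();

Console.WriteLine(string.Join(", ", cheap.Select(it => it.ItemName)));           // Pen, Novel
Console.WriteLine(string.Join(", ", cheapInFirstTwo.Select(it => it.ItemName))); // Pen
```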

A Little Word About "Var"

One thing of interest here is the use of var in the example above. This looks reminiscent of days of old (VB, Flash, JavaScript), basically any language that is not strongly typed. And those days were bad. These days we expect and use strongly typed objects. Even better, these days we also have Generics, which bring us even more type control over the software we write. Yet here is LINQ code, which is after all new stuff that will probably be part of .NET 3.5. Is this good?

Consider this statement:

"It is also not required to declare type of query variable, because type inference automatically deduces the type when the var keyword is used."

What do we think of this? Well, it's certainly better than what VB used to do, which was to determine the type at runtime. What LINQ does is determine the type at compile time. So, used wisely, var can actually help developers and decrease coding time. Of course, if you really want to be a stickler for hardcore typing, then you would have to do something like what is shown below:

EXAMPLE 1

IEnumerable<Item> iSmall =
    from it in _itemList
    where it.UnitPrice < 50.00M
    select it;

Instead of:

EXAMPLE 2

var iSmall =
    from it in _itemList
    where it.UnitPrice < 50.00M
    select it;

Can you see the difference?

In the 1st case, we select it, which happens to be of type Item, so we have to declare the result of the query as IEnumerable<Item> to match. This is how to strongly type a query result. If, however, the query was changed to return not an Item but, say, a string, then we would have to remember to change the query result type from IEnumerable<Item> to IEnumerable<string>. Also, if we had some complicated nested, joined, or aggregate (SUM, COUNT) operators as part of the query, the type we have to declare might be quite complex.

What would you guess the query result type to be for this:

from c in customers
join o in orders on c.CustomerID equals o.CustomerID into co
from o in co.DefaultIfEmpty(emptyOrder)
select new { c.Name, o.OrderDate, o.Total };

or how about this one:

from c in customers
select new
{
    c.CompanyName,
    YearGroups =
        from o in c.Orders
        group o by o.OrderDate.Year into yg
        select new
        {
            Year = yg.Key,
            MonthGroups =
                from o in yg
                group o by o.OrderDate.Month into mg
                select new
                {
                    Month = mg.Key,
                    Orders = mg
                }
        }
};

See, it gets tricky. We all know typing is good and is our friend. But sometimes it can also be fairly complicated as well.

In the 2nd example above, we simply use var instead of strongly typing the result of the query. This works, and the correct types are inferred, as they would be even for the most complicated query results. Also if we change the query, we don't have to change the var, as it will simply infer the new required types automatically. Job done.
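Here is a small sketch of the point at which var stops being a convenience and becomes a necessity: when the query projects into an anonymous type, there is simply no type name to write. The items are made up, and the sketch assumes the released System.Linq namespace and top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up items standing in for the demo project's Item class.
var items = new[]
{
    new { ItemName = "Pen", UnitPrice = 2.50m },
    new { ItemName = "Guitar", UnitPrice = 120.00m },
    new { ItemName = "Novel", UnitPrice = 12.99m },
};

// Projecting to a string: the result type can still be written out by hand.
IEnumerable<string> cheapNames =
    from it in items
    where it.UnitPrice < 50.00m
    select it.ItemName;

// Projecting to an anonymous type: there is no type name to write, so var is required.
var cheapSummaries =
    from it in items
    where it.UnitPrice < 50.00m
    select new { it.ItemName, IsBargain = it.UnitPrice < 5.00m };

foreach (var s in cheapSummaries)
    Console.WriteLine($"{s.ItemName}: bargain = {s.IsBargain}");
// Pen: bargain = True
// Novel: bargain = False
```

Either way the types are fixed at compile time; hovering over cheapSummaries in the IDE shows the inferred type.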

It is personal preference, but var can save time and frustration. Just use it wisely and all should be cool.

The Projection Query operator can be of either of the forms shown above, where the first argument of the selector function represents the element to process. The second argument, if present, represents the zero based index of the element within the source sequence.

So let's see a real world example (using the attached demo project and the _itemList local data source)

This is an interesting statement. The 1st part is the predicate i => i.UnitPrice >= 10, so only those items with a UnitPrice >= 10 are actually selected. Then, of those that are selected, we generate a new list (using the ToList() method) which only includes ItemName and UnitPrice. Neat huh?
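A statement of the shape just described might look something like the sketch below. This is my reconstruction from the description, not necessarily the demo project's exact code; the items are made up, and it assumes the released System.Linq namespace and top-level statements.

```csharp
using System;
using System.Linq;

// Made-up items standing in for the demo project's Item class.
var items = new[]
{
    new { ItemName = "Pen", Category = "Stationery", UnitPrice = 2.50m },
    new { ItemName = "Guitar", Category = "Entertainment", UnitPrice = 120.00m },
    new { ItemName = "Novel", Category = "Entertainment", UnitPrice = 12.99m },
    new { ItemName = "Bread", Category = "Food", UnitPrice = 1.50m },
};

// Filter first, then project each surviving item onto a smaller shape
// that carries only ItemName and UnitPrice.
var priced = items
    .Where(i => i.UnitPrice >= 10)
    .Select(i => new { i.ItemName, i.UnitPrice })
    .ToList();

Console.WriteLine(string.Join(", ", priced.Select(p => p.ItemName))); // Guitar, Novel
```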

There is also SelectMany, which I have not included here. But after you install LINQ, you can play with this yourself.

This example gets all Item objects which have a Category of "Entertainment", concatenates that with all Item objects which have a Category of "Food", and ensures there are no duplicates by using the Distinct() method.
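As a rough sketch of that Concat plus Distinct combination (made-up data, released System.Linq namespace, top-level statements; not the demo project's exact code), note the deliberately duplicated "Bread" entry so that Distinct() has something to do:

```csharp
using System;
using System.Linq;

// Made-up items; "Bread" appears twice on purpose.
var items = new[]
{
    new { ItemName = "Guitar", Category = "Entertainment" },
    new { ItemName = "Bread", Category = "Food" },
    new { ItemName = "Pen", Category = "Stationery" },
    new { ItemName = "Novel", Category = "Entertainment" },
    new { ItemName = "Bread", Category = "Food" },
};

var entertainment = items.Where(i => i.Category == "Entertainment");
var food = items.Where(i => i.Category == "Food");

// Concat appends the second sequence to the first; Distinct then removes duplicates.
// Anonymous types compare by value, so the two "Bread" entries count as equal.
var combined = entertainment.Concat(food).Distinct().ToList();

Console.WriteLine(string.Join(", ", combined.Select(i => i.ItemName))); // Guitar, Novel, Bread
```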

Order (OrderBy / ThenBy) Operator

The OrderBy / ThenBy family of operators order a sequence according to one or more keys.
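Ordering by more than one key can be sketched as below: OrderBy sets the primary key and ThenBy (or ThenByDescending) breaks ties. The data is invented, and the sketch assumes the released System.Linq namespace and top-level statements.

```csharp
using System;
using System.Linq;

// Made-up items standing in for the demo project's Item class.
var items = new[]
{
    new { ItemName = "Guitar", Category = "Entertainment", UnitPrice = 120.00m },
    new { ItemName = "Bread", Category = "Food", UnitPrice = 1.50m },
    new { ItemName = "Novel", Category = "Entertainment", UnitPrice = 12.99m },
    new { ItemName = "Cheese", Category = "Food", UnitPrice = 4.25m },
};

// Primary key: Category ascending. Secondary key: UnitPrice descending.
var ordered = items
    .OrderBy(i => i.Category)
    .ThenByDescending(i => i.UnitPrice)
    .ToList();

Console.WriteLine(string.Join(", ", ordered.Select(i => i.ItemName)));
// Guitar, Novel, Cheese, Bread
```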

So let's see a real world example (using the attached demo project and the _itemList local data source)

var itemNamesByCategory =
    from i in _itemList
    group i by i.Category into g
    select new { Category = g.Key, Items = g };

NOTE: This example is quite different from those supplied with the LINQ CTP. I could not get those to work, as I think the syntax may have changed since Microsoft wrote the LINQ documentation (for example, GroupBy did not seem to be liked, at least not in the way they described for this specific query). The example above does actually work.

This example gets all Item objects and simply groups them by Category. Then it selects the results into a new List (on the fly) where the Category is the key of the group result from the previous step, and the Items is set to be the current value of the items that matched the current grouping in the previous step. A little confusing, but let's have a look at the results that may help a little.
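To show what the grouped result looks like when you walk over it, here is a small sketch using the same query shape over made-up data (released System.Linq namespace, top-level statements). Each result element carries the group key plus the items that fell into that group.

```csharp
using System;
using System.Linq;

// Made-up items standing in for the demo project's Item class.
var items = new[]
{
    new { ItemName = "Guitar", Category = "Entertainment" },
    new { ItemName = "Bread", Category = "Food" },
    new { ItemName = "Novel", Category = "Entertainment" },
};

var itemNamesByCategory =
    from i in items
    group i by i.Category into g
    select new { Category = g.Key, Items = g };

// Outer loop: one entry per category. Inner loop: the items in that category.
foreach (var grp in itemNamesByCategory)
{
    Console.WriteLine(grp.Category);
    foreach (var i in grp.Items)
        Console.WriteLine("  " + i.ItemName);
}
// Entertainment
//   Guitar
//   Novel
// Food
//   Bread
```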

This example gets a unique List of all Item ItemID values and then intersects this result with a unique List of all Order OrderID values. The result is a List of ints that are common to both the Item and Order Lists.

Except

The Except operator produces the set difference between two sequences.

This example gets a unique List of all Item ItemID values and then applies Except to this result, using a unique List of all Order OrderID values. The result is a List of ints that appear in the Item List but not in the Order List.
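Since Intersect and Except are easy to mix up, here is a sketch that runs both over the same two made-up ID lists (released System.Linq namespace, top-level statements). The ID values are invented and simply stand in for ItemID and OrderID.

```csharp
using System;
using System.Linq;

// Made-up IDs standing in for Item.ItemID and Order.OrderID values.
int[] itemIds = { 1, 2, 3, 4, 5, 5 };
int[] orderIds = { 3, 4, 5, 6 };

// Intersect: values present in both sequences.
var common = itemIds.Distinct().Intersect(orderIds.Distinct()).ToList();

// Except: values in the first sequence that are NOT in the second (set difference).
var onlyItems = itemIds.Distinct().Except(orderIds.Distinct()).ToList();

Console.WriteLine(string.Join(", ", common));    // 3, 4, 5
Console.WriteLine(string.Join(", ", onlyItems)); // 1, 2
```

Except is not symmetric: swapping the two sequences here would give 6 instead.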

This example gets the ItemID of every Item object and the OrderID of every Order object, and checks whether the two sequences are equal. They are not, as the Item List contains more elements than the Order List, so there are certain ItemID values that don't appear in the Order List.
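The comparison just described can be sketched with two made-up ID arrays of different lengths (released System.Linq namespace, top-level statements). SequenceEqual only returns true when both sequences have the same elements in the same order.

```csharp
using System;
using System.Linq;

// Made-up IDs: there is one more item than there are orders.
int[] itemIds = { 1, 2, 3, 4 };
int[] orderIds = { 1, 2, 3 };

// Different lengths, so the sequences cannot be equal.
bool same = itemIds.SequenceEqual(orderIds);

// Trimming the longer sequence to the first three elements makes them match.
bool samePrefix = itemIds.Take(3).SequenceEqual(orderIds);

Console.WriteLine(same);       // False
Console.WriteLine(samePrefix); // True
```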

First

The First operator enumerates the source sequence and returns the first element for which the predicate function returns true. If no predicate function is specified, the First operator simply returns the first element of the sequence.

So let's see a real world example (using the attached demo project and the _itemList local data source)

The FirstOrDefault operator enumerates the source sequence and returns the first element for which the predicate function returns true. If no predicate function is specified, the FirstOrDefault operator simply returns the first element of the sequence.

So let's see a real world example (using the attached demo project and the _itemList local data source)

This example takes the 1st or default element of the List of Item that matches the predicate i.ItemName == itemName, where itemName = "A Non existence Element". In this case nothing matches, so we get null returned instead.
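Here is a side-by-side sketch of First and FirstOrDefault over made-up items (released System.Linq namespace, top-level statements; not the demo project's exact code). The practical difference is what happens when nothing matches: First throws, while FirstOrDefault hands back the default value, which is null for a class.

```csharp
using System;
using System.Linq;

// Made-up items standing in for the demo project's Item class.
var items = new[]
{
    new { ItemName = "Pen", UnitPrice = 2.50m },
    new { ItemName = "Guitar", UnitPrice = 120.00m },
    new { ItemName = "Television", UnitPrice = 400.00m },
};

// First: the first element that satisfies the predicate.
var firstExpensive = items.First(i => i.UnitPrice > 100);
Console.WriteLine(firstExpensive.ItemName); // Guitar

// FirstOrDefault: same search, but a miss yields default(T) instead of an exception.
string itemName = "A Non existence Element";
var missing = items.FirstOrDefault(i => i.ItemName == itemName);
Console.WriteLine(missing == null); // True

// Calling items.First(i => i.ItemName == itemName) here would throw
// an InvalidOperationException, because no element matches.
```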

Last

The Last operator enumerates the source sequence and returns the last element for which the predicate function returned true. If no predicate function is specified, the Last operator simply returns the last element of the sequence.

This works the same way as First; I'll leave this as an exercise for the reader.

LastOrDefault

The LastOrDefault operator returns the last element of a sequence, or a default value if no element is found.

The LastOrDefault operator enumerates the source sequence and returns the last element for which the predicate function returned true. If no predicate function is specified, the LastOrDefault operator simply returns the last element of the sequence.

This works the same way as FirstOrDefault; I'll leave this as an exercise for the reader.

Single

The Single operator enumerates the source sequence and returns the single element for which the predicate function returned true. If no predicate function is specified, the Single operator simply returns the single element of the sequence.

This works much the same way as First, except that Single throws an exception if more than one element matches; I'll leave this as an exercise for the reader.

SingleOrDefault

The SingleOrDefault operator returns the single element of a sequence, or a default value if no element is found.

The SingleOrDefault operator enumerates the source sequence and returns the single element for which the predicate function returned true. If no predicate function is specified, the SingleOrDefault operator simply returns the single element of the sequence.

This works the same way as FirstOrDefault; I'll leave this as an exercise for the reader.

ElementAt

The ElementAt operator returns the element at a given index in a sequence.

The ElementAt operator first checks whether the source sequence implements IList<T>. If so, the source sequence's implementation of IList<T> is used to obtain the element at the given index. Otherwise, the source sequence is enumerated until index elements have been skipped, and the element found at that position in the sequence is returned.

So let's see a real world example (using the attached demo project and the _itemList local data source)

The ElementAtOrDefault operator first checks whether the source sequence implements IList<T>. If so, the source sequence's implementation of IList<T> is used to obtain the element at the given index. Otherwise, the source sequence is enumerated until index elements have been skipped, and the element found at that position in the sequence is returned.

So let's see a real world example (using the attached demo project and the _itemList local data source)

Item itm = _itemList.ElementAtOrDefault(15);

This example simply attempts to fetch a non-existent element from the List of Item, so null is returned.
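For comparison, here is a sketch of ElementAt and ElementAtOrDefault next to each other, using a made-up array of names (released System.Linq namespace, top-level statements):

```csharp
using System;
using System.Linq;

// Made-up data: only three elements, so index 15 is well out of range.
string[] names = { "Pen", "Guitar", "Novel" };

// ElementAt: zero-based, so index 1 is the second element.
string second = names.ElementAt(1);
Console.WriteLine(second); // Guitar

// ElementAtOrDefault: an out-of-range index yields default(T), which is null for string.
var missing = names.ElementAtOrDefault(15);
Console.WriteLine(missing == null); // True

// names.ElementAt(15) would throw instead of returning null.
```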

DefaultIfEmpty

The DefaultIfEmpty operator supplies a default element for an empty sequence.
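A quick sketch of DefaultIfEmpty, assuming the released System.Linq namespace and top-level statements, with made-up arrays: an empty sequence is swapped for a one-element sequence holding the default, and a non-empty sequence passes through untouched.

```csharp
using System;
using System.Linq;

int[] empty = { };
int[] some = { 1, 2 };

// No argument: the default is default(int), i.e. 0.
var withDefault = empty.DefaultIfEmpty().ToList();

// With an argument: the supplied value is used instead.
var withCustom = empty.DefaultIfEmpty(-1).ToList();

// A non-empty sequence is returned unchanged.
var unchanged = some.DefaultIfEmpty(-1).ToList();

Console.WriteLine(string.Join(", ", withDefault)); // 0
Console.WriteLine(string.Join(", ", withCustom));  // -1
Console.WriteLine(string.Join(", ", unchanged));   // 1, 2
```

This is what makes the left outer join shown earlier (co.DefaultIfEmpty(emptyOrder)) work: customers with no orders still produce one row.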

The Any operator enumerates the source sequence and returns true if any element satisfies the test given by the predicate. If no predicate function is specified the Any operator simply returns true if the source sequence contains any elements. The enumeration of the source sequence is terminated as soon as the result is known.

So let's see a real world example (using the attached demo project)

bool b = _itemList.Any(i => i.UnitPrice >= 400);

This example simply returns true if there is any Item with a UnitPrice >= 400.

All

The All operator checks whether all elements of a sequence satisfy a condition.

The All operator enumerates the source sequence and returns true if no element fails the test given by the predicate. The enumeration of the source sequence is terminated as soon as the result is known.

So let's see a real world example (using the attached demo project and the _itemList local data source)

var itemNamesByCategory =
    from i in _itemList
    group i by i.Category into g
    where g.All(i => i.UnitsInStock > 0)
    select new { Category = g.Key, Items = g };

This example uses a group operator, and then uses the All operator to keep only those groups in which every Item matches the predicate i.UnitsInStock > 0.

NOTE: This example is quite different from those supplied with the LINQ CTP. I could not get those to work, as I think the syntax may have changed since Microsoft wrote the LINQ documentation (for example, GroupBy did not seem to be liked, at least not in the way they described for this specific query). The example above does actually work.

Contains

The Contains operator checks whether a sequence contains a given element.

public static bool Contains<T>(
    this IEnumerable<T> source,
    T value);

The Contains operator first checks whether the source sequence implements ICollection<T>. If so, the Contains method in the sequence's implementation of ICollection<T> is invoked to obtain the result. Otherwise, the source sequence is enumerated to determine if it contains an element with the given value. If a matching element is found, the enumeration of the source sequence is terminated at that point. The elements and the given value are compared using the default equality comparer, EqualityComparer<T>.Default.

So let's see a real world example (using the attached demo project and the _itemList local data source)

bool b = _itemList.Contains(_itemList[0]);

This example simply returns true if the source List of Item contains _itemList[0], which it does.

Aggregate (Count / LongCount / Sum / ... ) Operators

The Aggregate operators are made up of seven parts:

Count

The Count operator without a predicate first checks whether the source sequence implements ICollection<T>. If so, the sequence's implementation of ICollection<T> is used to obtain the element count. Otherwise, the source sequence is enumerated to count the number of elements. The Count operator with a predicate enumerates the source sequence and counts the number of elements for which the predicate function returns true.

So let's see a real world example (using the attached demo project and the _itemList local data source)

The LongCount operator enumerates the source sequence and counts the number of elements for which the predicate function returns true. If no predicate function is specified the LongCount operator simply counts all elements. The count of elements is returned as a value of type long.

The Sum operator enumerates the source sequence, invokes the selector function for each element, and computes the sum of the resulting values. If no selector function is specified, the sum of the elements themselves is computed.

So let's see a real world example (using the attached demo project and the _itemList local data source)

The Min operator enumerates the source sequence, invokes the selector function for each element, and finds the minimum of the resulting values. If no selector function is specified, the minimum of the elements themselves is computed. The values are compared using their implementation of the IComparable<T> interface, or, if the values do not implement that interface, the non-generic IComparable interface.

So let's see a real world example (using the attached demo project and the _itemList local data source)

The Max operator enumerates the source sequence, invokes the selector function for each element, and finds the maximum of the resulting values. If no selector function is specified, the maximum of the elements themselves is computed. The values are compared using their implementation of the IComparable<T> interface, or, if the values do not implement that interface, the non-generic IComparable interface.

Max works the same way as Min; I'll leave it as an exercise for the reader.

Average

The Average operator computes the average of a sequence of numeric values.

The Average operator enumerates the source sequence, invokes the selector function for each element, and computes the average of the resulting values. If no selector function is specified, the average of the elements themselves is computed.

Average works the same way as Min; I'll leave it as an exercise for the reader.
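To pull the simple aggregate operators together in one place, here is a sketch over a made-up array of prices (released System.Linq namespace, top-level statements). The prices are invented and stand in for the UnitPrice values of the demo project's items.

```csharp
using System;
using System.Linq;

// Made-up unit prices.
decimal[] prices = { 2.50m, 120.00m, 12.50m, 400.00m };

int count = prices.Count();                  // 4 elements in total
int cheapCount = prices.Count(p => p < 50);  // 2 elements satisfy the predicate
long longCount = prices.LongCount();         // same count, returned as a long

decimal sum = prices.Sum();                  // 2.50 + 120.00 + 12.50 + 400.00 = 535.00
decimal min = prices.Min();                  // 2.50
decimal max = prices.Max();                  // 400.00
decimal average = prices.Average();          // 535.00 / 4 = 133.75

Console.WriteLine($"{count} prices, {cheapCount} under 50");
```

Sum, Min, Max and Average also accept a selector, so with a list of items you could write something like items.Sum(i => i.UnitPrice) instead of projecting first.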

The Aggregate operator with a seed value starts by assigning the seed value to an internal accumulator. It then enumerates the source sequence, repeatedly computing the next accumulator value by invoking the specified function with the current accumulator value as the first argument and the current sequence element as the second argument. The final accumulator value is returned as the result.

I have to say this one actually defeated me. I searched and searched for another example, as the LINQ documentation one is pretty dire. Check it out, this one is direct from LINQ documentation.

This is not a very nice example, is it? This is kind of what we are getting with LINQ. It's very powerful, but some of it is pure crazy syntax. I mean, what the heck is the one above telling anyone? It's not very clear to me. Even the fabulous 101 LINQ Samples doesn't list an Aggregate operator example. So I guess we'll just have to gloss over this one for the time being.
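That said, the seeded accumulation from the formal definition can be sketched with something much smaller than the documentation example. This is my own made-up illustration, assuming the released System.Linq namespace and top-level statements; the seed becomes the starting accumulator, and the function is called once per element with (accumulator, element).

```csharp
using System;
using System.Linq;

int[] numbers = { 1, 2, 3, 4 };

// Seed 10, then: 10 + 1 = 11, 11 + 2 = 13, 13 + 3 = 16, 16 + 4 = 20.
int total = numbers.Aggregate(10, (acc, n) => acc + n);
Console.WriteLine(total); // 20

// The accumulator does not have to be the same type as the elements:
// here a string seed is built up from ints.
string csv = numbers.Aggregate(
    "",
    (acc, n) => acc.Length == 0 ? n.ToString() : acc + "," + n);
Console.WriteLine(csv); // 1,2,3,4
```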

Fold

Folding is a nice concept straight out of the functional programming world; it allows us to fold a function over the elements of a list (an IEnumerable<T> in our case). This is very powerful.

In this simple example, we have an array of which we want to get the product. We can simply use fold to literally fold some inline function (namely runningProduct * nextFactor) to form the result. Neat huh?
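The product described above can be sketched as follows. One loud assumption: as far as I know, Fold is the CTP's name for this operation, and in the released System.Linq namespace the same job is done by the Aggregate overload that takes no seed, so that is what this sketch uses. The array values are made up.

```csharp
using System;
using System.Linq;

// Made-up factors.
double[] doubles = { 1.5, 2.0, 4.0 };

// With no seed, the first element is the initial accumulator:
// 1.5 * 2.0 = 3.0, then 3.0 * 4.0 = 12.0.
double product = doubles.Aggregate(
    (runningProduct, nextFactor) => runningProduct * nextFactor);

Console.WriteLine(product); // 12
```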

Well, that's about it for the Standard Query Operators. If you've made it this far, well done. It took me ages to write this, and it's probably taken you ages to read this. So I'll forgive you if you want to come back later. But the next bit is all about dynamically created (at runtime) queries. Up until now it's all been pre-compiled queries, which is all very well but not very realistic. In the real world we would want to do dynamic queries, wouldn't we?

So far we have looked at static (defined at compile time) queries, which is all very well but not really what we will probably want to do for our real world applications.

It is also possible to create LINQ queries programmatically using information the user may have entered or selected from a UI. This may be achieved by the use of the following principles:

By The Use Of Variables

We can simply introduce variables in the query, as shown here:

decimal priceVal = 50.00M;
//query will now use a variable so is dynamic
var iSmall = from it in _itemList
             where it.UnitPrice < priceVal
             select it;

So by introducing a simple variable, we can control the query. Quite simple.

QueryExpression Type (Mainly Used In DLINQ)

According to the MSDN documentation, QueryExpression has only one property, ExpressionOperator, which gets or sets the operator used in the expression.

If, however, we look in Visual Studio 2005 (assuming you have May 2006 LINQ CTP installed), we get a better picture.

It can be seen from this figure that we can literally provide any ExpressionOperator that we like. This is one part of the secret to creating dynamic LINQ queries at runtime.

So we could actually do something like (paying special attention to the use of the filter, which can be any string)

//where filter is a string which could be set to "city = 'London'"
string filter = "city = 'London'";
expression = QueryExpression.Where(expression,
    QueryExpression.Lambda(filter, e));
//Finally, we create a new query based on that expression,
//where db inherits from DataContext and was automatically generated using
//the DLINQ tool Sqlmetal.exe
var query = db.CreateQuery<EmployeeView>(expression);

This example is actually a DLINQ query. But this should be possible in standard LINQ using IQueryable<T>, which is new to the March 2007 CTP, though I don't have that CTP installed, and probably won't install it as it's 4GB (it's the entire "Orcas" build) and is bound to change again.

The InteractiveQuery project which is installed as part of the May 2006 CTP is a good place to look for dynamic DLINQ queries.

IQueryable Interface

To do runtime queries over in-memory objects (LINQ) you will need the new March 2007 CTP, which allows the user to do this by the use of the new IQueryable<T> feature.

We now have the facilities to query an IQueryable object. Remember that IQueryable is only available in the new March 2007 CTP though. So let's delve a little deeper. We have this nice method sitting there that allows us to do some sort of Between dynamic query on an IQueryable object. So let's have a look at how we could call this new method.

Suppose we had a simple array:

int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

and that we then queried this array, like:

IEnumerable<int> query =
    from n in numbers
    where n % 2 == 1
    select n;

OK, so now we have a query result, which is of type IEnumerable<int>. Great. But what we could now do is something like:

IQueryable<int> query2 = Between(query.AsQueryable(), 0, 10);

So what's going on here? Well, we re-use the results of the 1st query, which yielded an IEnumerable<int>, and then we use .AsQueryable() to turn that IEnumerable<int> into an IQueryable object. This is all thanks to the new IQueryable interface that is now part of the March 2007 CTP. The runtime manages executing IQueryable queries built on top of IEnumerables.
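To give a feel for what a Between method over an IQueryable might look like, here is a hypothetical sketch. It is not the CTP's code: the helper name Between, its signature and its inclusive bounds are my own assumptions, and it is written against the released System.Linq and System.Linq.Expressions namespaces with top-level statements. The helper builds the predicate as an expression tree at runtime, which is the bit that makes the query "dynamic".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

// The same static query as above: keep the odd numbers.
IEnumerable<int> query =
    from n in numbers
    where n % 2 == 1
    select n;

// Bounds that could just as well come from user input at runtime.
IQueryable<int> query2 = Between(query.AsQueryable(), 3, 7);

Console.WriteLine(string.Join(", ", query2)); // 5, 3, 7

// Hypothetical helper: builds value => value >= low && value <= high
// as an expression tree and applies it as a Where clause.
static IQueryable<int> Between(IQueryable<int> source, int low, int high)
{
    ParameterExpression value = Expression.Parameter(typeof(int), "value");
    Expression body = Expression.AndAlso(
        Expression.GreaterThanOrEqual(value, Expression.Constant(low)),
        Expression.LessThanOrEqual(value, Expression.Constant(high)));
    Expression<Func<int, bool>> predicate =
        Expression.Lambda<Func<int, bool>>(body, value);
    return source.Where(predicate);
}
```

Because the predicate is data rather than compiled code, the comparison operators themselves (not just the bounds) could be chosen at runtime.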

I'm sure you'll agree we all have to learn how to do this at some stage. Personally I'm going to let the CTP mature a little more and the install instructions become a bit more clear (the current CTP is for the entire "Orcas" project, which is the next version of Visual Studio, so it's huge). But that's only my opinion; if you just can't wait for dynamic queries and a sneak peek at "Orcas" then download the March 2007 CTP.

Well, that's actually about it. As I said, this article has probably not shown you much that you could not have learned from going to 101 LINQ Samples. However, all information must come from somewhere. And perhaps some folks would not have known about 101 LINQ Samples or even LINQ unless they actually read this article, in which case I've probably done them a favour. Also, this article does show actual working code, whereas I found that some of the LINQ samples in the LINQ documentation just did not work, or were so complicated that they would scare some folk. I've really tried to keep the examples in this article as simple as possible.

The next two articles (XLINQ / DLINQ) should display some new material which should be of use to you all. I promise I'll try and make them cover new material.

I would just like to ask, if you liked the article please vote for it, as it allows me to know if the article was at the right level or not.

Also, let me know if you think that the next two proposed articles should include this much material or less. After all, I want to write articles that actually help people out. I know this one had a lot of stuff in it. But it's new stuff that is common to LINQ/DLINQ/XLINQ, so lots of information had to be covered. Anyway, let me know your thoughts.

I have quite enjoyed constructing this article, and have been quite refreshed at just how easy LINQ is to actually use (well, most of it; some of the Group and Aggregate operators are just plain nasty). I also had quite a nostalgic feeling as it reminded me of doing Haskell programming, which I would thoroughly recommend everyone avoid, as it's simply crazy. But if you like lambdas, then you should get jiggy with curries and lazy evaluation and all that functional type of stuff. It's quite different actually.