This article is part of a two-part series on the LinqToWikipedia provider. The first article covers the basic concepts of Linq and the client usage of this particular provider, while the second covers the inner workings of the LinqToWikipedia provider to give you an understanding of what it takes to create your own IQueryable provider.

NOTE: You should download the latest build from CodePlex so you can follow along with the code samples.

Creating your own IQueryable provider

First off, we have to ask ourselves the question: "Why would we want to create our own IQueryable provider?" It certainly isn't the easiest coding task that you'll undertake, and the internet isn't exactly overflowing with simple end-to-end examples of how to accomplish it. What benefits do we really get out of doing this?

Well, we already know the advantages of using Linq from my previous article: it gives us a consistent querying syntax, whether we write Linq query syntax or lambda expressions, no matter what the underlying data source is. That alone is a pretty good first reason to build our own provider. Here are a few more potential uses:

You have several applications in your organization that need to verify addresses. You could write a Linq provider that translates queries into calls to the United States Postal Service APIs to verify physical address locations.

Your organization has employee demographic data in various systems such as PeopleSoft, Active Directory, etc. You could write a Linq provider that allows applications to query and return employee data from these various systems in one result set.

You maintain websites for real estate agents and want to give potential customers the ability to look at the value of nearby homes in a certain area without leaving the agent's website. You could write a Linq provider that connects to the Zillow.com API and returns a wealth of information for the user to gauge the value of a potential home.

Now that we have discussed some reasons why you may want to create your own IQueryable provider, let's walk through the LinqToWikipedia provider to show how you can create your own.

NOTE: I am only going to show code highlights of important concepts, but you can always download the full source code from CodePlex to view the code in its entirety. Additionally, to keep the examples simple, I will focus only on the OpenSearch functionality of the provider.

At a high level, there are two interfaces that you will need to implement in order to create your own IQueryable provider:

IQueryable<T>

We implement this interface so that our results can be enumerated.

IQueryProvider

We implement this interface so that we can examine the query (expression tree), translate it into the actual query for whatever API we are calling, and then execute that query.

Before we get into the actual provider code, we will first start at the client call so we can follow a path through the execution process:

    WikipediaContext datacontext = new WikipediaContext();

    var opensearch = (
        from wikipedia in datacontext.OpenSearch
        where wikipedia.Keyword == "Microsoft"
        select wikipedia).Take(5);

Here you can see that we created a new instance of the WikipediaContext class and then called the datacontext.OpenSearch property. This allows the query to return as type IWikipediaQueryable<WikipediaOpenSearchResult> (this will be explained in the later sections). Then we filter the results to include only those where Keyword == "Microsoft". Finally, we return only the first 5 records via the Take() extension method.

We have broken the query down so that you can see where everything more or less fits together as we review the code later.
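To make the shape of that query more concrete, here is a minimal sketch that builds the same kind of query over an in-memory list and walks the resulting expression tree from the outside in. DemoResult and QueryShapeDemo are illustrative names and are not part of the provider; the sketch only assumes the standard Queryable operators.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Illustrative stand-in for the result type; only Keyword matters here.
public class DemoResult
{
    public string Keyword { get; set; }
}

public static class QueryShapeDemo
{
    // Returns the names of the chained query operators, outermost first.
    public static List<string> OperatorNames()
    {
        IQueryable<DemoResult> source = new List<DemoResult>().AsQueryable();

        var query = (
            from r in source
            where r.Keyword == "Microsoft"
            select r).Take(5);

        var names = new List<string>();
        Expression current = query.Expression;

        // Each operator is a MethodCallExpression whose first argument is the
        // expression it was applied to, so we can walk inward from Take to Where.
        while (current is MethodCallExpression call)
        {
            names.Add(call.Method.Name);
            current = call.Arguments[0];
        }

        return names;
    }
}
```

The outermost call is Take and its source is the Where call, which is the order in which a provider meets them when it inspects the tree. The trivial "select r" does not appear because the compiler drops it when a where clause is present.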

Let's now look at the WikipediaContext class:

    public class WikipediaContext
    {
        public IWikipediaQueryable<WikipediaOpenSearchResult> OpenSearch
        {
            get { return new WikipediaQueryable<WikipediaOpenSearchResult>(); }
        }
    }

Seems pretty simple: a class with one property, OpenSearch, that returns type IWikipediaQueryable<WikipediaOpenSearchResult>. In actuality, we are returning the concrete type WikipediaQueryable<WikipediaOpenSearchResult>, which implements the IWikipediaQueryable<T> interface. This is important because that interface also extends IQueryable<T>, which is what allows our provider to return itself as an IQueryable.

Up to this point we will be returning an enumerable list of WikipediaOpenSearchResult, which is the result class that we will be populating from the Wikipedia API.

    public class WikipediaOpenSearchResult : IWikipediaOpenSearchResult
    {
        public string Text { get; set; }
        public string Description { get; set; }
        public Uri Url { get; set; }
        public Uri ImageUrl { get; set; }
        public string Keyword { get; set; }
    }

Since we are implementing the IQueryable<T> interface, we also inherit the members of IEnumerable<T>, IQueryable, and IEnumerable, so we are obligated to provide implementations of the following:

IEnumerator IEnumerable.GetEnumerator()

IEnumerator<T> IEnumerable<T>.GetEnumerator()

Type IQueryable.ElementType

Expression IQueryable.Expression

IQueryProvider IQueryable.Provider

For the purposes of this tutorial we will focus only on the IEnumerator<T> IEnumerable<T>.GetEnumerator() method, because this is where the actual call to our custom query provider takes place.

Notice that on lines 17 - 21 we are calling our custom query provider, WikipediaQueryProvider. This is where the provider is passed the lambda expression via its Execute method. Once the provider has translated the query and executed it against the API, it returns an IEnumerable<T>, where T is actually the WikipediaOpenSearchResult type in this case.
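As a rough illustration of how the two interfaces cooperate, here is a minimal sketch of a queryable whose GetEnumerator() hands its expression to a provider. This is not the LinqToWikipedia source (the line numbers mentioned above refer to the CodePlex listing); SketchQueryable, SketchProvider, and the canned result are hypothetical, and the provider skips the translation step entirely.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical queryable: it only stores an expression and a provider.
public class SketchQueryable<T> : IQueryable<T>
{
    private readonly IQueryProvider _provider;
    private readonly Expression _expression;

    // Root of a query: the expression is a constant that points at this object.
    public SketchQueryable(IQueryProvider provider)
    {
        _provider = provider;
        _expression = Expression.Constant(this);
    }

    // Used by CreateQuery when Where(), Take(), etc. wrap the existing tree.
    public SketchQueryable(IQueryProvider provider, Expression expression)
    {
        _provider = provider;
        _expression = expression;
    }

    Type IQueryable.ElementType { get { return typeof(T); } }
    Expression IQueryable.Expression { get { return _expression; } }
    IQueryProvider IQueryable.Provider { get { return _provider; } }

    IEnumerator<T> IEnumerable<T>.GetEnumerator()
    {
        // Enumeration is the moment the query actually runs.
        return ((IEnumerable<T>)_provider.Execute(_expression)).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<T>)this).GetEnumerator();
    }
}

// Hypothetical provider: returns canned strings instead of calling an API.
public class SketchProvider : IQueryProvider
{
    public int ExecuteCount { get; private set; }

    public IQueryable CreateQuery(Expression expression)
    {
        throw new NotSupportedException("Only the generic overload is used in this sketch.");
    }

    public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
    {
        return new SketchQueryable<TElement>(this, expression);
    }

    public object Execute(Expression expression)
    {
        ExecuteCount++;
        // A real provider would translate the expression into an API request here.
        return new List<string> { "canned result" };
    }

    public TResult Execute<TResult>(Expression expression)
    {
        return (TResult)Execute(expression);
    }
}
```

With these two classes, composing a query such as new SketchQueryable<string>(provider).Where(s => s.Length > 0).Take(5) only builds up the expression tree. Execute is not called until the result is enumerated, for example by a foreach loop or ToList().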

Since our custom query provider will implement IQueryProvider we must provide implementations for the following:

IQueryable CreateQuery(Expression expression)

IQueryable<TElement> CreateQuery<TElement>(Expression expression)

object Execute(Expression expression)

TResult Execute<TResult>(Expression expression)

For the purposes of this tutorial we will focus only on the object Execute(Expression expression) method, because this is where the HTTP request is sent to the Wikipedia API and the response is retrieved.

Line 5 gets a string result from the WikipediaOpenSearchRequest.Send method, which simply sends our formatted URL to the Wikipedia API. That string is then passed to the WikipediaOpenSearchResponse.Get method, which parses the results and returns them as type IEnumerable<WikipediaOpenSearchResult>.

We'll cover the WikipediaOpenSearchResponse.Get method later; for now, let's take a look at the WikipediaOpenSearchRequest.Send method.

This static method builds a URI based on the expression and then uses a helper method, HttpRequest.Send, which sends the URI using HttpWebRequest. The WikipediaOpenSearchUriBuilder class contains the code that parses the expression so that we can translate it into a usable URI.
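A helper of that kind might look roughly like the following sketch. SketchHttpRequest and its User-Agent value are assumptions made for illustration and are not the provider's actual helper. The User-Agent header is set because the Wikipedia API may treat requests without one as coming from a bot. Note also that recent .NET versions mark WebRequest.Create as obsolete in favor of HttpClient, so expect a compiler warning there.

```csharp
using System;
using System.IO;
using System.Net;

public static class SketchHttpRequest
{
    // Assumed value; substitute something that identifies your application.
    public const string UserAgent = "LinqProviderSketch/1.0";

    // Builds the GET request without sending it.
    public static HttpWebRequest Create(Uri uri)
    {
        var request = (HttpWebRequest)WebRequest.Create(uri);
        request.Method = "GET";
        request.UserAgent = UserAgent;
        return request;
    }

    // Sends the request and returns the response body as a string.
    public static string Send(Uri uri)
    {
        HttpWebRequest request = Create(uri);

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Splitting Create from Send keeps the request configuration easy to inspect without touching the network.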

NOTE: Learning to parse lambda expressions is a topic worthy of a book. You will see some basic samples below on parsing expressions; however, the focus of this article is to demonstrate how to create a Linq provider. I would suggest further exploration of expression parsing to ensure that you have a solid foundation in lambda expressions. Here are a few good starting points:

We begin the process of building our querystring variable that will be sent to the Wikipedia API. Throughout the class we will be appending new items to the querystring as we parse the expression tree.

Next, on line 18 we define the method that takes over building our URL from the lambda expression:

public Uri BuildUri(Expression expression)

and the actual parsing of the expression tree starts on line 29:

Visit((MethodCallExpression)expression);

This method in the ExpressionVisitor class begins evaluating the expression and subsequently takes us back to our overridden method on line 35:

protected override Expression VisitMethodCall(MethodCallExpression m)

This method is where we can check the values contained in the expression, specifically in the Where() and Take() extension methods so that we can parse the expression and extract the values that we are looking for.

if (m.Method.Name.Equals("Where"))

and

if (m.Method.Name.Equals("Take"))

Lines 39-48 are where we parse the lambda expression and look for the keyword that was supplied to the Where() extension method. This value is then appended to the querystring as the "search" value passed to the Wikipedia API.

_urlBuilder.Append("search=" + query);

Next on lines 49-58 we look for the value that was supplied to the Take() extension method. This value is then appended to the querystring as the "limit" value passed to the Wikipedia API.
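The Where() and Take() handling described above can be sketched as follows. This is a simplified illustration, not the WikipediaOpenSearchUriBuilder source: SketchItem, SketchUriBuilder, and the baseUrl parameter are hypothetical, it uses the ExpressionVisitor class that ships in System.Linq.Expressions, and it only understands a comparison against a literal string (a captured variable would need extra handling).

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Text;

// Illustrative element type with the one property the query filters on.
public class SketchItem
{
    public string Keyword { get; set; }
}

public class SketchUriBuilder : ExpressionVisitor
{
    private string _search;
    private int? _limit;

    // baseUrl is supplied by the caller; no real endpoint is assumed here.
    public Uri BuildUri(string baseUrl, Expression expression)
    {
        _search = null;
        _limit = null;
        Visit(expression);

        var url = new StringBuilder(baseUrl);
        url.Append("?search=" + Uri.EscapeDataString(_search ?? string.Empty));
        if (_limit.HasValue)
        {
            url.Append("&limit=" + _limit.Value);
        }
        return new Uri(url.ToString());
    }

    protected override Expression VisitMethodCall(MethodCallExpression m)
    {
        if (m.Method.DeclaringType == typeof(Queryable) && m.Method.Name == "Where")
        {
            // Where(source, x => x.Keyword == "value"): the predicate is quoted, so unwrap it.
            var lambda = (LambdaExpression)StripQuotes(m.Arguments[1]);
            var comparison = lambda.Body as BinaryExpression;
            if (comparison != null && comparison.NodeType == ExpressionType.Equal)
            {
                var constant = comparison.Right as ConstantExpression;
                if (constant != null)
                {
                    _search = constant.Value as string;
                }
            }
        }
        else if (m.Method.DeclaringType == typeof(Queryable) && m.Method.Name == "Take")
        {
            // Take(source, count): the count arrives as a constant.
            var count = m.Arguments[1] as ConstantExpression;
            if (count != null)
            {
                _limit = (int)count.Value;
            }
        }

        // Keep walking inward so operators nested in the source are visited too.
        Visit(m.Arguments[0]);
        return m;
    }

    private static Expression StripQuotes(Expression e)
    {
        while (e.NodeType == ExpressionType.Quote)
        {
            e = ((UnaryExpression)e).Operand;
        }
        return e;
    }
}
```

Because Take wraps Where in the tree, the visitor sees Take first and then reaches Where through the first argument, so the order in which values are discovered differs from the order in which they appear in the URL.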

So up to this point, we have translated the lambda expression into a URL that is ready to be passed to the Wikipedia API. We now need to accept the data (XML) returned from the Wikipedia API and populate our WikipediaOpenSearchResult class with it. So let's move on to the WikipediaOpenSearchResponse.Get method.

    // The class and method declarations are reconstructed from the description
    // that follows this listing; see the full source for the exact signatures.
    public class WikipediaOpenSearchResponse
    {
        public static IEnumerable<WikipediaOpenSearchResult> Get(string xml)
        {
            List<WikipediaOpenSearchResult> resultList = new List<WikipediaOpenSearchResult>();

            // Walk every element in the response and pick out the "Item" elements.
            foreach (XElement element in XDocument.Parse(xml).Descendants())
            {
                if (element.Name.LocalName == "Item")
                {
                    WikipediaOpenSearchResult wsr = new WikipediaOpenSearchResult();

                    // Elements() skips text and comment nodes, so every item is an XElement.
                    foreach (XElement item in element.Elements())
                    {
                        switch (item.Name.LocalName)
                        {
                            case "Text":
                                wsr.Text = item.Value;
                                break;
                            case "Description":
                                wsr.Description = item.Value;
                                break;
                            case "Url":
                                wsr.Url = new Uri(item.Value);
                                break;
                            case "Image":
                                // The image location is stored in the "source" attribute.
                                wsr.ImageUrl = new Uri(item.Attribute("source").Value);
                                break;
                        }
                    }

                    resultList.Add(wsr);
                }
            }

            return resultList;
        }
    }

This class has one static method that accepts a string of XML and returns a generic List<> of type WikipediaOpenSearchResult as an IEnumerable. We are using Linq to XML to parse out the data and return the elements that we need to populate our WikipediaOpenSearchResult class.
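The same parsing step can also be written as a single Linq to XML projection. The sketch below is an alternative illustration rather than the provider's code: SketchSearchResult and SketchResponseParser are hypothetical names, and SampleXml is an invented document (with a placeholder Root wrapper) shaped only to match the element names the parser looks for, not a real API response.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class SketchSearchResult
{
    public string Text { get; set; }
    public string Description { get; set; }
    public Uri Url { get; set; }
    public Uri ImageUrl { get; set; }
}

public static class SketchResponseParser
{
    // Invented sample; the Root wrapper and all values are placeholders.
    public const string SampleXml =
        "<Root><Item>" +
        "<Text>Microsoft</Text>" +
        "<Description>A software company.</Description>" +
        "<Url>http://example.org/wiki/Microsoft</Url>" +
        "<Image source=\"http://example.org/logo.png\" />" +
        "</Item></Root>";

    public static List<SketchSearchResult> Parse(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants()
            .Where(e => e.Name.LocalName == "Item")
            .Select(item => new SketchSearchResult
            {
                // Casting a missing XElement or XAttribute to string yields null.
                Text = (string)Child(item, "Text"),
                Description = (string)Child(item, "Description"),
                Url = ToUri((string)Child(item, "Url")),
                ImageUrl = ToUri((string)Child(item, "Image")?.Attribute("source"))
            })
            .ToList();
    }

    // Finds a direct child by local name, ignoring any XML namespace.
    private static XElement Child(XElement parent, string localName)
    {
        return parent.Elements().FirstOrDefault(e => e.Name.LocalName == localName);
    }

    private static Uri ToUri(string value)
    {
        return string.IsNullOrEmpty(value) ? null : new Uri(value);
    }
}
```

Unlike the switch-based loop, this version leaves a property null when its element is missing instead of throwing, which may or may not be the behavior you want.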

From here the IEnumerable is passed back up the stack to the WikipediaQueryProvider and returned to the client as type WikipediaQueryable<WikipediaOpenSearchResult>.

The error occurs because the Wikipedia API thinks that you are coming in as a "bot". The fix is a single line of code that adds a "UserAgent" header to the web request. You can either see the fix on CodePlex and make the change yourself, or wait for me to update the release version with the fix.

Thank you for this work. It has helped me a lot. However, I do have a question: is it possible to obtain all the information from a page with a list of subjects, for instance: http://en.wikipedia.org/wiki/List_of_sports

I have been trying to change the parameters and even the whole WikipediaOpenSearchUriBuilder class but I was not able to get any info from the wikipedia sandbox and therefore, not sure how to do this.